Harvest for USFWS
Created: 04 Jun 16:59
Stage: completed
Fetched: 04 Jun 16:59
Validated: 04 Jun 16:59
Deltas Created: 04 Jun 16:59
Units Normalized: 04 Jun 17:00
Ancestry Built: 04 Jun 16:59
Nodes Matched: 04 Jun 17:00
Names Parsed: 04 Jun 16:59
New Models Stored: 04 Jun 16:59
Indexed: 04 Jun 17:00
Completed: 04 Jun 17:02
Time to Harvest: less than a minute
Harvesting Log (161 lines)
[INFO] [2021-06-04 16:59:31] Created harvest instance #3993
[STOP] [2021-06-04 16:59:31] create_harvest_instance
[START] [2021-06-04 16:59:31] fetch_files
[STOP] [2021-06-04 16:59:31] fetch_files
[START] [2021-06-04 16:59:31] validate_each_file
[INFO] [2021-06-04 16:59:31] Looping over 4 formats...
[INFO] [2021-06-04 16:59:31] ...nodes (/app/public/data/uesldu/taxon.tab)
[INFO] [2021-06-04 16:59:31] Valid: /app/public/converted_csv/uesldu_nodes_3993.csv (2274 lines)
[INFO] [2021-06-04 16:59:31] ...vernaculars (/app/public/data/uesldu/vernacular_name.tab)
[INFO] [2021-06-04 16:59:31] Valid: /app/public/converted_csv/uesldu_vernaculars_3993.csv (2078 lines)
[INFO] [2021-06-04 16:59:31] ...occurrences (/app/public/data/uesldu/occurrence_specific.tab)
[INFO] [2021-06-04 16:59:31] Valid: /app/public/converted_csv/uesldu_occurrences_3993.csv (2293 lines)
[INFO] [2021-06-04 16:59:31] ...measurements (/app/public/data/uesldu/measurement_or_fact_specific.tab)
[INFO] [2021-06-04 16:59:32] Valid: /app/public/converted_csv/uesldu_measurements_3993.csv (2293 lines)
[STOP] [2021-06-04 16:59:32] validate_each_file
[START] [2021-06-04 16:59:32] convert_to_csv
[INFO] [2021-06-04 16:59:32] Looping over 4 formats...
[INFO] [2021-06-04 16:59:32] ...nodes (/app/public/data/uesldu/taxon.tab)
[CMD] [2021-06-04 16:59:32] /usr/bin/sort /app/public/converted_csv/uesldu_nodes_3993.csv > /app/public/converted_csv/uesldu_nodes_3993.csv_sorted
[INFO] [2021-06-04 16:59:32] Converted: /app/public/converted_csv/uesldu_nodes_3993.csv (2274 lines)
[INFO] [2021-06-04 16:59:32] ...vernaculars (/app/public/data/uesldu/vernacular_name.tab)
[CMD] [2021-06-04 16:59:32] /usr/bin/sort /app/public/converted_csv/uesldu_vernaculars_3993.csv > /app/public/converted_csv/uesldu_vernaculars_3993.csv_sorted
[INFO] [2021-06-04 16:59:32] Converted: /app/public/converted_csv/uesldu_vernaculars_3993.csv (2078 lines)
[INFO] [2021-06-04 16:59:32] ...occurrences (/app/public/data/uesldu/occurrence_specific.tab)
[CMD] [2021-06-04 16:59:32] /usr/bin/sort /app/public/converted_csv/uesldu_occurrences_3993.csv > /app/public/converted_csv/uesldu_occurrences_3993.csv_sorted
[INFO] [2021-06-04 16:59:32] Converted: /app/public/converted_csv/uesldu_occurrences_3993.csv (2293 lines)
[INFO] [2021-06-04 16:59:32] ...measurements (/app/public/data/uesldu/measurement_or_fact_specific.tab)
[CMD] [2021-06-04 16:59:32] /usr/bin/sort /app/public/converted_csv/uesldu_measurements_3993.csv > /app/public/converted_csv/uesldu_measurements_3993.csv_sorted
[INFO] [2021-06-04 16:59:32] Converted: /app/public/converted_csv/uesldu_measurements_3993.csv (2293 lines)
[STOP] [2021-06-04 16:59:32] convert_to_csv
[START] [2021-06-04 16:59:32] calculate_delta
[INFO] [2021-06-04 16:59:32] Looping over 4 formats...
[INFO] [2021-06-04 16:59:32] ...nodes (/app/public/data/uesldu/taxon.tab)
[CMD] [2021-06-04 16:59:32] echo "0a" > /app/public/diff/uesldu_nodes_3993.diff
[CMD] [2021-06-04 16:59:32] tail -n +1 /app/public/converted_csv/uesldu_nodes_3993.csv >> /app/public/diff/uesldu_nodes_3993.diff
[CMD] [2021-06-04 16:59:32] echo "." >> /app/public/diff/uesldu_nodes_3993.diff
[INFO] [2021-06-04 16:59:32] Created diff: /app/public/diff/uesldu_nodes_3993.diff (2276 lines)
[INFO] [2021-06-04 16:59:32] ...vernaculars (/app/public/data/uesldu/vernacular_name.tab)
[CMD] [2021-06-04 16:59:32] echo "0a" > /app/public/diff/uesldu_vernaculars_3993.diff
[CMD] [2021-06-04 16:59:32] tail -n +1 /app/public/converted_csv/uesldu_vernaculars_3993.csv >> /app/public/diff/uesldu_vernaculars_3993.diff
[CMD] [2021-06-04 16:59:32] echo "." >> /app/public/diff/uesldu_vernaculars_3993.diff
[INFO] [2021-06-04 16:59:32] Created diff: /app/public/diff/uesldu_vernaculars_3993.diff (2080 lines)
[INFO] [2021-06-04 16:59:32] ...occurrences (/app/public/data/uesldu/occurrence_specific.tab)
[CMD] [2021-06-04 16:59:32] echo "0a" > /app/public/diff/uesldu_occurrences_3993.diff
[CMD] [2021-06-04 16:59:32] tail -n +1 /app/public/converted_csv/uesldu_occurrences_3993.csv >> /app/public/diff/uesldu_occurrences_3993.diff
[CMD] [2021-06-04 16:59:32] echo "." >> /app/public/diff/uesldu_occurrences_3993.diff
[INFO] [2021-06-04 16:59:32] Created diff: /app/public/diff/uesldu_occurrences_3993.diff (2295 lines)
[INFO] [2021-06-04 16:59:32] ...measurements (/app/public/data/uesldu/measurement_or_fact_specific.tab)
[CMD] [2021-06-04 16:59:32] echo "0a" > /app/public/diff/uesldu_measurements_3993.diff
[CMD] [2021-06-04 16:59:33] tail -n +1 /app/public/converted_csv/uesldu_measurements_3993.csv >> /app/public/diff/uesldu_measurements_3993.diff
[CMD] [2021-06-04 16:59:33] echo "." >> /app/public/diff/uesldu_measurements_3993.diff
[INFO] [2021-06-04 16:59:33] Created diff: /app/public/diff/uesldu_measurements_3993.diff (2295 lines)
[STOP] [2021-06-04 16:59:33] calculate_delta
[START] [2021-06-04 16:59:33] parse_diff_and_store
[INFO] [2021-06-04 16:59:33] Handling diff: /app/public/diff/uesldu_nodes_3993.diff (2276 lines)
[INFO] [2021-06-04 16:59:33] Loading nodes diff file into memory (2276 /app/public/diff/uesldu_nodes_3993.diff lines)...
[INFO] [2021-06-04 16:59:33] Handling diff: /app/public/diff/uesldu_vernaculars_3993.diff (2080 lines)
[INFO] [2021-06-04 16:59:33] Loading vernaculars diff file into memory (2080 /app/public/diff/uesldu_vernaculars_3993.diff lines)...
[INFO] [2021-06-04 16:59:34] Handling diff: /app/public/diff/uesldu_occurrences_3993.diff (2295 lines)
[INFO] [2021-06-04 16:59:34] Loading occurrences diff file into memory (2295 /app/public/diff/uesldu_occurrences_3993.diff lines)...
[INFO] [2021-06-04 16:59:34] Handling diff: /app/public/diff/uesldu_measurements_3993.diff (2295 lines)
[INFO] [2021-06-04 16:59:34] Loading measurements diff file into memory (2295 /app/public/diff/uesldu_measurements_3993.diff lines)...
[INFO] [2021-06-04 16:59:35] Storing 2274 ScientificNames
[INFO] [2021-06-04 16:59:35] Processing group of 2274 in 3 groups of 1000
[INFO] [2021-06-04 16:59:36] Average Time: 0.22
[INFO] [2021-06-04 16:59:36] Total Time: 1s
[INFO] [2021-06-04 16:59:36] Storing 2274 Nodes
[INFO] [2021-06-04 16:59:36] Processing group of 2274 in 3 groups of 1000
[INFO] [2021-06-04 16:59:37] Average Time: 0.27
[INFO] [2021-06-04 16:59:37] Total Time: 1s
[INFO] [2021-06-04 16:59:37] Storing 2078 Vernaculars
[INFO] [2021-06-04 16:59:37] Processing group of 2078 in 3 groups of 1000
[INFO] [2021-06-04 16:59:37] Average Time: 0.113
[INFO] [2021-06-04 16:59:37] Total Time: 1s
[INFO] [2021-06-04 16:59:37] Storing 2293 Occurrences
[INFO] [2021-06-04 16:59:37] Processing group of 2293 in 3 groups of 1000
[INFO] [2021-06-04 16:59:37] Average Time: 0.087
[INFO] [2021-06-04 16:59:37] Total Time: 1s
[INFO] [2021-06-04 16:59:37] Storing 2293 Traits
[INFO] [2021-06-04 16:59:37] Processing group of 2293 in 3 groups of 1000
[INFO] [2021-06-04 16:59:38] Average Time: 0.333
[INFO] [2021-06-04 16:59:38] Total Time: 2s
[STOP] [2021-06-04 16:59:38] parse_diff_and_store
[START] [2021-06-04 16:59:38] resolve_keys
[INFO] [2021-06-04 16:59:49] Occurrences to nodes (through scientific_names)...
[INFO] [2021-06-04 16:59:54] traits to occurrences...
[INFO] [2021-06-04 16:59:54] traits to nodes (through occurrences)...
[INFO] [2021-06-04 16:59:54] Traits to sex term...
[INFO] [2021-06-04 16:59:54] Traits to lifestage term...
[INFO] [2021-06-04 16:59:54] MetaTraits to traits...
[INFO] [2021-06-04 16:59:54] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2021-06-04 16:59:55] Assocs to occurrences...
[INFO] [2021-06-04 16:59:55] Assocs to nodes...
[INFO] [2021-06-04 16:59:55] Assoc to sex term...
[INFO] [2021-06-04 16:59:55] Assoc to lifestage term...
[INFO] [2021-06-04 16:59:55] MetaAssoc to assocs...
[STOP] [2021-06-04 16:59:55] resolve_keys
[START] [2021-06-04 16:59:55] hold_for_later_1
[STOP] [2021-06-04 16:59:55] hold_for_later_1
[START] [2021-06-04 16:59:55] hold_for_later_2
[STOP] [2021-06-04 16:59:55] hold_for_later_2
[START] [2021-06-04 16:59:55] resolve_missing_parents
[STOP] [2021-06-04 16:59:55] resolve_missing_parents
[START] [2021-06-04 16:59:55] rebuild_nodes
[START] [2021-06-04 16:59:55] Flattener#flatten
[START] [2021-06-04 16:59:55] Flattener#study_resource
[START] [2021-06-04 16:59:55] Flattener#build_ancestry
[STOP] [2021-06-04 16:59:55] Flattener#build_ancestry
[INFO] [2021-06-04 16:59:55] 2274 ancestry keys
[START] [2021-06-04 16:59:55] build_node_ancestors
[INFO] [2021-06-04 16:59:55] old ancestors deleted.
[STOP] [2021-06-04 16:59:55] build_node_ancestors
[WARN] [2021-06-04 16:59:55] Flattener: nothing to flatten! (Completely flat resource?)
[STOP] [2021-06-04 16:59:55] Flattener#flatten
[STOP] [2021-06-04 16:59:55] rebuild_nodes
[START] [2021-06-04 16:59:55] resolve_missing_media_owners
[STOP] [2021-06-04 16:59:55] resolve_missing_media_owners
[START] [2021-06-04 16:59:55] sanitize_media_verbatims
[STOP] [2021-06-04 16:59:55] sanitize_media_verbatims
[START] [2021-06-04 16:59:55] queue_downloads
[STOP] [2021-06-04 16:59:55] queue_downloads
[START] [2021-06-04 16:59:55] parse_names
[WARN] [2021-06-04 16:59:55] I see 2274 names which still need to be parsed.
[STOP] [2021-06-04 16:59:57] parse_names
[START] [2021-06-04 16:59:57] denormalize_canonical_names_to_nodes
[STOP] [2021-06-04 16:59:58] denormalize_canonical_names_to_nodes
[START] [2021-06-04 16:59:58] match_nodes
[START] [2021-06-04 16:59:58] map_all_nodes_to_pages
[STOP] [2021-06-04 17:00:27] map_all_nodes_to_pages
[INFO] [2021-06-04 17:00:27] 199 Unmatched nodes (of 2274)! That's too many to output. Full list in /app/public/data/uesldu/unmatched_nodes.txt ; First 10: Canonical: Nototrichium humile; Node#95661532; ResourceID: 1001; Canonical: Pyrgulopsis trivialis; Node#95661536; ResourceID: 1017; Canonical: Troides alexandrae; Node#95661539; ResourceID: 1034; Canonical: Falco araea; Node#95661583; ResourceID: 1120; Canonical: Herpailurus; Node#95661587; ResourceID: 1124; Canonical: Grus canadensis pulla; Node#95661605; ResourceID: 1222; Canonical: Schiedea trinervis; Node#95661634; ResourceID: 1324; Canonical: Mimizuku; Node#95661664; ResourceID: 1471; Canonical: Newcombia cumingi; Node#95661695; ResourceID: 1529; Canonical: Rhaphiomidas terminatus abdominalis; Node#95661699; ResourceID: 1540
[START] [2021-06-04 17:00:27] update_nodes
[STOP] [2021-06-04 17:00:28] update_nodes
[STOP] [2021-06-04 17:00:28] match_nodes
[START] [2021-06-04 17:00:28] reindex_search
[STOP] [2021-06-04 17:00:29] reindex_search
[START] [2021-06-04 17:00:29] normalize_units
[STOP] [2021-06-04 17:00:29] normalize_units
[START] [2021-06-04 17:00:29] calculate_statistics
[2021-06-04 17:00:29] ZERO NODE ANCESTORS. Is this actually a completely flat resource?
[STOP] [2021-06-04 17:00:29] calculate_statistics
[START] [2021-06-04 17:00:29] complete_harvest_instance
[START] [2021-06-04 17:00:29] overall_tsv_creation
[INFO] [2021-06-04 17:00:29] Processing group of 2274 in 1 batches of 10000
[INFO] [2021-06-04 17:01:12] 2293 Traits (unfiltered)...
[INFO] [2021-06-04 17:01:58] 2293 Traits (filtered)...
[INFO] [2021-06-04 17:01:58] 0 Associations (filtered)...
[INFO] [2021-06-04 17:01:58] 0 metadata added.
[INFO] [2021-06-04 17:01:58] 0 metadata added.
[INFO] [2021-06-04 17:02:25] Average Time: 91.88
[INFO] [2021-06-04 17:02:25] Total Time: 1m56s
[STOP] [2021-06-04 17:02:25] overall_tsv_creation
[INFO] [2021-06-04 17:02:25] Done. Check your files:
[INFO] [2021-06-04 17:02:25] (2274 lines) /app/public/data/uesldu/publish_nodes.tsv
[INFO] [2021-06-04 17:02:25] (2274 lines) /app/public/data/uesldu/publish_scientific_names.tsv
[INFO] [2021-06-04 17:02:25] (2078 lines) /app/public/data/uesldu/publish_vernaculars.tsv
[INFO] [2021-06-04 17:02:25] (2294 lines) /app/public/data/uesldu/publish_traits.tsv
[INFO] [2021-06-04 17:02:25] (1 lines) /app/public/data/uesldu/publish_metadata.tsv
[STOP] [2021-06-04 17:02:25] complete_harvest_instance
[START] [2021-06-04 17:02:25] completed
[STOP] [2021-06-04 17:02:25] completed
[STOP] [2021-06-04 17:02:25] logged process, took 174.37
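The START/STOP pairs in the log above can be used to see where the harvest time went. The following is a minimal sketch, not part of the harvester: it assumes every log line has the form "[LEVEL] [YYYY-MM-DD HH:MM:SS] message", pairs each STOP with the most recent START of the same name, and skips anything else (lines without a level tag, or a STOP that has no matching START). The function name stage_durations and the sample list are hypothetical; the sample lines are copied from the log.

```python
import re
from datetime import datetime

# Matches lines such as "[START] [2021-06-04 16:59:31] fetch_files".
LINE_RE = re.compile(r"^\[(?P<level>[A-Z]+)\] \[(?P<ts>[^\]]+)\] (?P<msg>.*)$")


def stage_durations(lines):
    """Return (stage, seconds) for each matched START/STOP pair, in STOP order."""
    open_stages = {}  # stage name -> timestamp of its latest START
    durations = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # line does not follow the assumed format
        level = m.group("level")
        if level not in ("START", "STOP"):
            continue  # INFO, WARN, CMD lines carry no stage boundary
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S")
        stage = m.group("msg")
        if level == "START":
            open_stages[stage] = ts
        elif stage in open_stages:
            start = open_stages.pop(stage)
            durations.append((stage, (ts - start).total_seconds()))
    return durations


# Sample lines copied from the log above.
sample = [
    "[START] [2021-06-04 16:59:58] map_all_nodes_to_pages",
    "[STOP] [2021-06-04 17:00:27] map_all_nodes_to_pages",
    "[START] [2021-06-04 17:00:29] overall_tsv_creation",
    "[INFO] [2021-06-04 17:01:12] 2293 Traits (unfiltered)...",
    "[STOP] [2021-06-04 17:02:25] overall_tsv_creation",
]

for stage, seconds in stage_durations(sample):
    print(f"{stage}: {seconds:.0f}s")
```

On these sample lines the sketch reports 29 seconds for map_all_nodes_to_pages and 116 seconds for overall_tsv_creation, the latter agreeing with the "Total Time: 1m56s" entry in the log. Timestamps have one-second resolution, so stages that start and stop within the same second show as 0.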