Harvest for Aruba Species List Created 02 Jun 12:05

Stage: completed
Fetched: 02 Jun 12:05
Validated: 02 Jun 12:05
Deltas Created: 02 Jun 12:05
Units Normalized: 02 Jun 12:09
Ancestry Built: 02 Jun 12:06
Nodes Matched: 02 Jun 12:09
Names Parsed: 02 Jun 12:06
New Models Stored: 02 Jun 12:05
Indexed: 02 Jun 12:09
Completed: 02 Jun 12:11
Time to Harvest: about 6 minutes (the log reports 386.86 for the whole process, 12:05:15 to 12:11:41)

Harvesting Log

(172 lines)
[INFO] [2021-06-02 12:05:15] Created harvest instance #3982
[STOP] [2021-06-02 12:05:15] create_harvest_instance
[START] [2021-06-02 12:05:15] fetch_files
[STOP] [2021-06-02 12:05:15] fetch_files
[START] [2021-06-02 12:05:15] validate_each_file
[INFO] [2021-06-02 12:05:15] Looping over 4 formats...
[INFO] [2021-06-02 12:05:15] ...refs (/app/public/data/aruba_sp_list/reference.tab)
[INFO] [2021-06-02 12:05:15] Valid: /app/public/converted_csv/aruba_sp_list_refs_3982.csv (110 lines)
[INFO] [2021-06-02 12:05:15] ...nodes (/app/public/data/aruba_sp_list/taxon.tab)
[INFO] [2021-06-02 12:05:15] Valid: /app/public/converted_csv/aruba_sp_list_nodes_3982.csv (2229 lines)
[INFO] [2021-06-02 12:05:15] ...occurrences (/app/public/data/aruba_sp_list/occurrence_specific.tab)
[INFO] [2021-06-02 12:05:15] Valid: /app/public/converted_csv/aruba_sp_list_occurrences_3982.csv (2229 lines)
[INFO] [2021-06-02 12:05:15] ...measurements (/app/public/data/aruba_sp_list/measurement_or_fact_specific.tab)
[INFO] [2021-06-02 12:05:16] Valid: /app/public/converted_csv/aruba_sp_list_measurements_3982.csv (7413 lines)
[STOP] [2021-06-02 12:05:16] validate_each_file
[START] [2021-06-02 12:05:16] convert_to_csv
[INFO] [2021-06-02 12:05:16] Looping over 4 formats...
[INFO] [2021-06-02 12:05:16] ...refs (/app/public/data/aruba_sp_list/reference.tab)
[CMD] [2021-06-02 12:05:16] /usr/bin/sort /app/public/converted_csv/aruba_sp_list_refs_3982.csv > /app/public/converted_csv/aruba_sp_list_refs_3982.csv_sorted
[INFO] [2021-06-02 12:05:17] Converted: /app/public/converted_csv/aruba_sp_list_refs_3982.csv (110 lines)
[INFO] [2021-06-02 12:05:17] ...nodes (/app/public/data/aruba_sp_list/taxon.tab)
[CMD] [2021-06-02 12:05:17] /usr/bin/sort /app/public/converted_csv/aruba_sp_list_nodes_3982.csv > /app/public/converted_csv/aruba_sp_list_nodes_3982.csv_sorted
[INFO] [2021-06-02 12:05:18] Converted: /app/public/converted_csv/aruba_sp_list_nodes_3982.csv (2229 lines)
[INFO] [2021-06-02 12:05:18] ...occurrences (/app/public/data/aruba_sp_list/occurrence_specific.tab)
[CMD] [2021-06-02 12:05:18] /usr/bin/sort /app/public/converted_csv/aruba_sp_list_occurrences_3982.csv > /app/public/converted_csv/aruba_sp_list_occurrences_3982.csv_sorted
[INFO] [2021-06-02 12:05:19] Converted: /app/public/converted_csv/aruba_sp_list_occurrences_3982.csv (2229 lines)
[INFO] [2021-06-02 12:05:19] ...measurements (/app/public/data/aruba_sp_list/measurement_or_fact_specific.tab)
[CMD] [2021-06-02 12:05:19] /usr/bin/sort /app/public/converted_csv/aruba_sp_list_measurements_3982.csv > /app/public/converted_csv/aruba_sp_list_measurements_3982.csv_sorted
[INFO] [2021-06-02 12:05:20] Converted: /app/public/converted_csv/aruba_sp_list_measurements_3982.csv (7413 lines)
[STOP] [2021-06-02 12:05:20] convert_to_csv
[START] [2021-06-02 12:05:20] calculate_delta
[INFO] [2021-06-02 12:05:20] Looping over 4 formats...
[INFO] [2021-06-02 12:05:20] ...refs (/app/public/data/aruba_sp_list/reference.tab)
[CMD] [2021-06-02 12:05:20] echo "0a" > /app/public/diff/aruba_sp_list_refs_3982.diff
[CMD] [2021-06-02 12:05:21] tail -n +1 /app/public/converted_csv/aruba_sp_list_refs_3982.csv >> /app/public/diff/aruba_sp_list_refs_3982.diff
[CMD] [2021-06-02 12:05:22] echo "." >> /app/public/diff/aruba_sp_list_refs_3982.diff
[INFO] [2021-06-02 12:05:23] Created diff: /app/public/diff/aruba_sp_list_refs_3982.diff (112 lines)
[INFO] [2021-06-02 12:05:23] ...nodes (/app/public/data/aruba_sp_list/taxon.tab)
[CMD] [2021-06-02 12:05:23] echo "0a" > /app/public/diff/aruba_sp_list_nodes_3982.diff
[CMD] [2021-06-02 12:05:24] tail -n +1 /app/public/converted_csv/aruba_sp_list_nodes_3982.csv >> /app/public/diff/aruba_sp_list_nodes_3982.diff
[CMD] [2021-06-02 12:05:25] echo "." >> /app/public/diff/aruba_sp_list_nodes_3982.diff
[INFO] [2021-06-02 12:05:27] Created diff: /app/public/diff/aruba_sp_list_nodes_3982.diff (2231 lines)
[INFO] [2021-06-02 12:05:27] ...occurrences (/app/public/data/aruba_sp_list/occurrence_specific.tab)
[CMD] [2021-06-02 12:05:27] echo "0a" > /app/public/diff/aruba_sp_list_occurrences_3982.diff
[CMD] [2021-06-02 12:05:28] tail -n +1 /app/public/converted_csv/aruba_sp_list_occurrences_3982.csv >> /app/public/diff/aruba_sp_list_occurrences_3982.diff
[CMD] [2021-06-02 12:05:29] echo "." >> /app/public/diff/aruba_sp_list_occurrences_3982.diff
[INFO] [2021-06-02 12:05:30] Created diff: /app/public/diff/aruba_sp_list_occurrences_3982.diff (2231 lines)
[INFO] [2021-06-02 12:05:30] ...measurements (/app/public/data/aruba_sp_list/measurement_or_fact_specific.tab)
[CMD] [2021-06-02 12:05:30] echo "0a" > /app/public/diff/aruba_sp_list_measurements_3982.diff
[CMD] [2021-06-02 12:05:31] tail -n +1 /app/public/converted_csv/aruba_sp_list_measurements_3982.csv >> /app/public/diff/aruba_sp_list_measurements_3982.diff
[CMD] [2021-06-02 12:05:32] echo "." >> /app/public/diff/aruba_sp_list_measurements_3982.diff
[INFO] [2021-06-02 12:05:33] Created diff: /app/public/diff/aruba_sp_list_measurements_3982.diff (7415 lines)
[STOP] [2021-06-02 12:05:33] calculate_delta
[START] [2021-06-02 12:05:33] parse_diff_and_store
[INFO] [2021-06-02 12:05:33] Handling diff: /app/public/diff/aruba_sp_list_refs_3982.diff (112 lines)
[INFO] [2021-06-02 12:05:34] Loading refs diff file into memory (112 /app/public/diff/aruba_sp_list_refs_3982.diff lines)...
[INFO] [2021-06-02 12:05:35] Handling diff: /app/public/diff/aruba_sp_list_nodes_3982.diff (2231 lines)
[INFO] [2021-06-02 12:05:36] Loading nodes diff file into memory (2231 /app/public/diff/aruba_sp_list_nodes_3982.diff lines)...
[INFO] [2021-06-02 12:05:39] Handling diff: /app/public/diff/aruba_sp_list_occurrences_3982.diff (2231 lines)
[INFO] [2021-06-02 12:05:40] Loading occurrences diff file into memory (2231 /app/public/diff/aruba_sp_list_occurrences_3982.diff lines)...
[INFO] [2021-06-02 12:05:41] Handling diff: /app/public/diff/aruba_sp_list_measurements_3982.diff (7415 lines)
[INFO] [2021-06-02 12:05:42] Loading measurements diff file into memory (7415 /app/public/diff/aruba_sp_list_measurements_3982.diff lines)...
[INFO] [2021-06-02 12:05:45] Storing 110 References
[INFO] [2021-06-02 12:05:45] Processing group of 110 in 1 groups of 1000
[INFO] [2021-06-02 12:05:45] Average Time: 0.04
[INFO] [2021-06-02 12:05:45] Total Time: 1s
[INFO] [2021-06-02 12:05:45] Storing 4400 ScientificNames
[INFO] [2021-06-02 12:05:45] Processing group of 4400 in 5 groups of 1000
[INFO] [2021-06-02 12:05:47] Average Time: 0.32
[INFO] [2021-06-02 12:05:47] Total Time: 2s
[INFO] [2021-06-02 12:05:47] Storing 4400 Nodes
[INFO] [2021-06-02 12:05:47] Processing group of 4400 in 5 groups of 1000
[INFO] [2021-06-02 12:05:48] Average Time: 0.246
[INFO] [2021-06-02 12:05:48] Total Time: 2s
[INFO] [2021-06-02 12:05:48] Storing 2229 Occurrences
[INFO] [2021-06-02 12:05:48] Processing group of 2229 in 3 groups of 1000
[INFO] [2021-06-02 12:05:49] Average Time: 0.09
[INFO] [2021-06-02 12:05:49] Total Time: 1s
[INFO] [2021-06-02 12:05:49] Storing 7413 Traits
[INFO] [2021-06-02 12:05:49] Processing group of 7413 in 8 groups of 1000
[INFO] [2021-06-02 12:05:51] Average Time: 0.3
[INFO] [2021-06-02 12:05:51] Total Time: 3s
[INFO] [2021-06-02 12:05:51] last 3 / first 3: 0.78
[INFO] [2021-06-02 12:05:51] Std.Dev: 0.07745966692414834; Max: 0.4
[INFO] [2021-06-02 12:05:51] Storing 2955 TraitsReferences
[INFO] [2021-06-02 12:05:51] Processing group of 2955 in 3 groups of 1000
[INFO] [2021-06-02 12:05:51] Average Time: 0.067
[INFO] [2021-06-02 12:05:51] Total Time: 1s
[INFO] [2021-06-02 12:05:51] Storing 2229 MetaTraits
[INFO] [2021-06-02 12:05:51] Processing group of 2229 in 3 groups of 1000
[INFO] [2021-06-02 12:05:52] Average Time: 0.107
[INFO] [2021-06-02 12:05:52] Total Time: 1s
[STOP] [2021-06-02 12:05:52] parse_diff_and_store
[START] [2021-06-02 12:05:52] resolve_keys
[INFO] [2021-06-02 12:06:04] Occurrences to nodes (through scientific_names)...
[INFO] [2021-06-02 12:06:06] traits to occurrences...
[INFO] [2021-06-02 12:06:06] traits to nodes (through occurrences)...
[INFO] [2021-06-02 12:06:06] Traits to sex term...
[INFO] [2021-06-02 12:06:06] Traits to lifestage term...
[INFO] [2021-06-02 12:06:07] MetaTraits to traits...
[INFO] [2021-06-02 12:06:07] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2021-06-02 12:06:07] Assocs to occurrences...
[INFO] [2021-06-02 12:06:07] Assocs to nodes...
[INFO] [2021-06-02 12:06:07] Assoc to sex term...
[INFO] [2021-06-02 12:06:07] Assoc to lifestage term...
[INFO] [2021-06-02 12:06:07] MetaAssoc to assocs...
[STOP] [2021-06-02 12:06:07] resolve_keys
[START] [2021-06-02 12:06:07] hold_for_later_1
[STOP] [2021-06-02 12:06:07] hold_for_later_1
[START] [2021-06-02 12:06:07] hold_for_later_2
[STOP] [2021-06-02 12:06:07] hold_for_later_2
[START] [2021-06-02 12:06:07] resolve_missing_parents
[STOP] [2021-06-02 12:06:07] resolve_missing_parents
[START] [2021-06-02 12:06:07] rebuild_nodes
[START] [2021-06-02 12:06:07] Flattener#flatten
[START] [2021-06-02 12:06:07] Flattener#study_resource
[START] [2021-06-02 12:06:07] Flattener#build_ancestry
[STOP] [2021-06-02 12:06:07] Flattener#build_ancestry
[INFO] [2021-06-02 12:06:07] 4400 ancestry keys
[START] [2021-06-02 12:06:07] build_node_ancestors
[INFO] [2021-06-02 12:06:07] old ancestors deleted.
[STOP] [2021-06-02 12:06:08] build_node_ancestors
[START] [2021-06-02 12:06:09] Flattener#propagate_ancestor_ids
[STOP] [2021-06-02 12:06:10] Flattener#propagate_ancestor_ids
[STOP] [2021-06-02 12:06:10] Flattener#flatten
[STOP] [2021-06-02 12:06:10] rebuild_nodes
[START] [2021-06-02 12:06:10] resolve_missing_media_owners
[STOP] [2021-06-02 12:06:10] resolve_missing_media_owners
[START] [2021-06-02 12:06:10] sanitize_media_verbatims
[STOP] [2021-06-02 12:06:10] sanitize_media_verbatims
[START] [2021-06-02 12:06:10] queue_downloads
[STOP] [2021-06-02 12:06:10] queue_downloads
[START] [2021-06-02 12:06:10] parse_names
[WARN] [2021-06-02 12:06:10] I see 4400 names which still need to be parsed.
[WARN] [2021-06-02 12:06:14] I see 38 names which still need to be parsed.
[STOP] [2021-06-02 12:06:15] parse_names
[START] [2021-06-02 12:06:15] denormalize_canonical_names_to_nodes
[STOP] [2021-06-02 12:06:15] denormalize_canonical_names_to_nodes
[START] [2021-06-02 12:06:16] match_nodes
[START] [2021-06-02 12:06:16] map_all_nodes_to_pages
[STOP] [2021-06-02 12:09:25] map_all_nodes_to_pages
[INFO] [2021-06-02 12:09:25] 190 Unmatched nodes (of 4400)! That's too many to output. Full list in /app/public/data/aruba_sp_list/unmatched_nodes.txt ; First 10: Canonical: Candona; Node#95608688; ResourceID: 3254823; Canonical: Odontomachus; Node#95606065; ResourceID: 1317162; Canonical: Pheidole; Node#95606066; ResourceID: 1321654; Canonical: Crematogaster obscurata; Node#95610096; ResourceID: 7714306; Canonical: Centris; Node#95606081; ResourceID: 1342796; Canonical: Hylaeus; Node#95606083; ResourceID: 1349360; Canonical: Appias celestina barea; Node#95610281; ResourceID: 8788871; Canonical: Triatoma pseudomaculata; Node#95610237; ResourceID: 8479618; Canonical: Melita; Node#95606202; ResourceID: 2215940; Canonical: Maera; Node#95606204; ResourceID: 2216013
[START] [2021-06-02 12:09:25] update_nodes
[STOP] [2021-06-02 12:09:27] update_nodes
[STOP] [2021-06-02 12:09:27] match_nodes
[START] [2021-06-02 12:09:27] reindex_search
[STOP] [2021-06-02 12:09:31] reindex_search
[START] [2021-06-02 12:09:31] normalize_units
[STOP] [2021-06-02 12:09:31] normalize_units
[START] [2021-06-02 12:09:31] calculate_statistics
[STOP] [2021-06-02 12:09:31] calculate_statistics
[START] [2021-06-02 12:09:31] complete_harvest_instance
[START] [2021-06-02 12:09:31] overall_tsv_creation
[INFO] [2021-06-02 12:09:31] Processing group of 4400 in 1 batches of 10000
[INFO] [2021-06-02 12:10:22] 2229 Traits (unfiltered)...
[INFO] [2021-06-02 12:11:06] 2229 Traits (filtered)...
[INFO] [2021-06-02 12:11:09] 0 Associations (filtered)...
[INFO] [2021-06-02 12:11:10] 5910 metadata added.
[INFO] [2021-06-02 12:11:10] 0 metadata added.
[INFO] [2021-06-02 12:11:36] Average Time: 96.35
[INFO] [2021-06-02 12:11:36] Total Time: 2m5s
[STOP] [2021-06-02 12:11:36] overall_tsv_creation
[INFO] [2021-06-02 12:11:36] Done. Check your files:
[INFO] [2021-06-02 12:11:37] (4396 lines) /app/public/data/aruba_sp_list/publish_nodes.tsv
[INFO] [2021-06-02 12:11:38] (22274 lines) /app/public/data/aruba_sp_list/publish_node_ancestors.tsv
[INFO] [2021-06-02 12:11:39] (4400 lines) /app/public/data/aruba_sp_list/publish_scientific_names.tsv
[INFO] [2021-06-02 12:11:40] (2230 lines) /app/public/data/aruba_sp_list/publish_traits.tsv
[INFO] [2021-06-02 12:11:41] (5911 lines) /app/public/data/aruba_sp_list/publish_metadata.tsv
[STOP] [2021-06-02 12:11:41] complete_harvest_instance
[START] [2021-06-02 12:11:41] completed
[STOP] [2021-06-02 12:11:41] completed
[STOP] [2021-06-02 12:11:41] logged process, took 386.86
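Note on reading the timings: almost all of the elapsed time in this log falls in two stages, map_all_nodes_to_pages and overall_tsv_creation. The sketch below is one way to pull per-stage durations out of the [START]/[STOP] lines. It is an illustration only and not part of the harvester: the function name stage_durations and the regular expression are assumptions based on the line layout shown above, and the sample lines are copied from this log.

```python
import re
from datetime import datetime

# Assumed layout, taken from the lines above: [LEVEL] [YYYY-MM-DD HH:MM:SS] message
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)

def stage_durations(lines):
    """Return {stage_name: seconds} for every START that has a matching STOP."""
    started = {}
    durations = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # INFO, WARN and CMD lines are ignored
        kind, stamp, name = m.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            started[name] = when
        elif name in started:
            # STOP lines with no earlier START (for example the final
            # "logged process, took ..." line) are skipped.
            durations[name] = (when - started.pop(name)).total_seconds()
    return durations

# Sample lines copied from the log above.
sample = [
    "[START] [2021-06-02 12:06:16] map_all_nodes_to_pages",
    "[STOP] [2021-06-02 12:09:25] map_all_nodes_to_pages",
    "[START] [2021-06-02 12:09:31] overall_tsv_creation",
    "[INFO] [2021-06-02 12:09:31] Processing group of 4400 in 1 batches of 10000",
    "[STOP] [2021-06-02 12:11:36] overall_tsv_creation",
]

durations = stage_durations(sample)
for name, seconds in sorted(durations.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {seconds:.0f}s")
```

For the sample lines this gives 189 seconds for map_all_nodes_to_pages and 125 seconds for overall_tsv_creation; the second figure agrees with the "Total Time: 2m5s" line the harvester logged for that stage.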

Latest Process