Harvest for Hone et al Pterosaur Data
Created: 31 May 18:49

Stage: completed
Fetched: 31 May 18:49
Validated: 31 May 18:49
Deltas Created: 31 May 18:49
Units Normalized: 31 May 18:49
Ancestry Built: 31 May 18:49
Nodes Matched: 31 May 18:49
Names Parsed: 31 May 18:49
New Models Stored: 31 May 18:49
Indexed: 31 May 18:49
Completed: 31 May 18:51
Time to Harvest: about 2 minutes (the log reports 113.07 seconds)

Harvesting Log

(215 lines)
[INFO] [2021-05-31 18:49:14] Created harvest instance #3946
[STOP] [2021-05-31 18:49:14] create_harvest_instance
[START] [2021-05-31 18:49:14] fetch_files
[STOP] [2021-05-31 18:49:14] fetch_files
[START] [2021-05-31 18:49:14] validate_each_file
[INFO] [2021-05-31 18:49:14] Looping over 8 formats...
[INFO] [2021-05-31 18:49:14] ...agents (/app/public/data/heapd/agents.txt)
[INFO] [2021-05-31 18:49:14] Valid: /app/public/converted_csv/heapd_agents_3946.csv (0 lines)
[INFO] [2021-05-31 18:49:14] ...refs (/app/public/data/heapd/references.txt)
[INFO] [2021-05-31 18:49:14] Valid: /app/public/converted_csv/heapd_refs_3946.csv (0 lines)
[INFO] [2021-05-31 18:49:14] ...nodes (/app/public/data/heapd/taxa.txt)
[INFO] [2021-05-31 18:49:14] Valid: /app/public/converted_csv/heapd_nodes_3946.csv (50 lines)
[INFO] [2021-05-31 18:49:14] ...media (/app/public/data/heapd/media.txt)
[INFO] [2021-05-31 18:49:14] Valid: /app/public/converted_csv/heapd_media_3946.csv (0 lines)
[INFO] [2021-05-31 18:49:14] ...vernaculars (/app/public/data/heapd/common names.txt)
[INFO] [2021-05-31 18:49:14] Valid: /app/public/converted_csv/heapd_vernaculars_3946.csv (0 lines)
[INFO] [2021-05-31 18:49:14] ...occurrences (/app/public/data/heapd/occurrences.txt)
[INFO] [2021-05-31 18:49:14] Valid: /app/public/converted_csv/heapd_occurrences_3946.csv (50 lines)
[INFO] [2021-05-31 18:49:14] ...assocs (/app/public/data/heapd/associations.txt)
[INFO] [2021-05-31 18:49:14] Valid: /app/public/converted_csv/heapd_assocs_3946.csv (0 lines)
[INFO] [2021-05-31 18:49:14] ...measurements (/app/public/data/heapd/measurements or facts.txt)
[INFO] [2021-05-31 18:49:14] Valid: /app/public/converted_csv/heapd_measurements_3946.csv (276 lines)
[STOP] [2021-05-31 18:49:14] validate_each_file
[START] [2021-05-31 18:49:14] convert_to_csv
[INFO] [2021-05-31 18:49:14] Looping over 8 formats...
[INFO] [2021-05-31 18:49:14] ...agents (/app/public/data/heapd/agents.txt)
[CMD] [2021-05-31 18:49:14] /usr/bin/sort /app/public/converted_csv/heapd_agents_3946.csv > /app/public/converted_csv/heapd_agents_3946.csv_sorted
[INFO] [2021-05-31 18:49:15] Converted: /app/public/converted_csv/heapd_agents_3946.csv (0 lines)
[INFO] [2021-05-31 18:49:15] ...refs (/app/public/data/heapd/references.txt)
[CMD] [2021-05-31 18:49:15] /usr/bin/sort /app/public/converted_csv/heapd_refs_3946.csv > /app/public/converted_csv/heapd_refs_3946.csv_sorted
[INFO] [2021-05-31 18:49:15] Converted: /app/public/converted_csv/heapd_refs_3946.csv (0 lines)
[INFO] [2021-05-31 18:49:15] ...nodes (/app/public/data/heapd/taxa.txt)
[CMD] [2021-05-31 18:49:15] /usr/bin/sort /app/public/converted_csv/heapd_nodes_3946.csv > /app/public/converted_csv/heapd_nodes_3946.csv_sorted
[INFO] [2021-05-31 18:49:15] Converted: /app/public/converted_csv/heapd_nodes_3946.csv (50 lines)
[INFO] [2021-05-31 18:49:15] ...media (/app/public/data/heapd/media.txt)
[CMD] [2021-05-31 18:49:15] /usr/bin/sort /app/public/converted_csv/heapd_media_3946.csv > /app/public/converted_csv/heapd_media_3946.csv_sorted
[INFO] [2021-05-31 18:49:16] Converted: /app/public/converted_csv/heapd_media_3946.csv (0 lines)
[INFO] [2021-05-31 18:49:16] ...vernaculars (/app/public/data/heapd/common names.txt)
[CMD] [2021-05-31 18:49:16] /usr/bin/sort /app/public/converted_csv/heapd_vernaculars_3946.csv > /app/public/converted_csv/heapd_vernaculars_3946.csv_sorted
[INFO] [2021-05-31 18:49:16] Converted: /app/public/converted_csv/heapd_vernaculars_3946.csv (0 lines)
[INFO] [2021-05-31 18:49:16] ...occurrences (/app/public/data/heapd/occurrences.txt)
[CMD] [2021-05-31 18:49:16] /usr/bin/sort /app/public/converted_csv/heapd_occurrences_3946.csv > /app/public/converted_csv/heapd_occurrences_3946.csv_sorted
[INFO] [2021-05-31 18:49:17] Converted: /app/public/converted_csv/heapd_occurrences_3946.csv (50 lines)
[INFO] [2021-05-31 18:49:17] ...assocs (/app/public/data/heapd/associations.txt)
[CMD] [2021-05-31 18:49:17] /usr/bin/sort /app/public/converted_csv/heapd_assocs_3946.csv > /app/public/converted_csv/heapd_assocs_3946.csv_sorted
[INFO] [2021-05-31 18:49:17] Converted: /app/public/converted_csv/heapd_assocs_3946.csv (0 lines)
[INFO] [2021-05-31 18:49:17] ...measurements (/app/public/data/heapd/measurements or facts.txt)
[CMD] [2021-05-31 18:49:17] /usr/bin/sort /app/public/converted_csv/heapd_measurements_3946.csv > /app/public/converted_csv/heapd_measurements_3946.csv_sorted
[INFO] [2021-05-31 18:49:17] Converted: /app/public/converted_csv/heapd_measurements_3946.csv (276 lines)
[STOP] [2021-05-31 18:49:17] convert_to_csv
[START] [2021-05-31 18:49:17] calculate_delta
[INFO] [2021-05-31 18:49:17] Looping over 8 formats...
[INFO] [2021-05-31 18:49:17] ...agents (/app/public/data/heapd/agents.txt)
[CMD] [2021-05-31 18:49:17] echo "0a" > /app/public/diff/heapd_agents_3946.diff
[CMD] [2021-05-31 18:49:18] tail -n +1 /app/public/converted_csv/heapd_agents_3946.csv >> /app/public/diff/heapd_agents_3946.diff
[CMD] [2021-05-31 18:49:18] echo "." >> /app/public/diff/heapd_agents_3946.diff
[INFO] [2021-05-31 18:49:18] Created diff: /app/public/diff/heapd_agents_3946.diff (2 lines)
[INFO] [2021-05-31 18:49:18] ...refs (/app/public/data/heapd/references.txt)
[CMD] [2021-05-31 18:49:18] echo "0a" > /app/public/diff/heapd_refs_3946.diff
[CMD] [2021-05-31 18:49:19] tail -n +1 /app/public/converted_csv/heapd_refs_3946.csv >> /app/public/diff/heapd_refs_3946.diff
[CMD] [2021-05-31 18:49:19] echo "." >> /app/public/diff/heapd_refs_3946.diff
[INFO] [2021-05-31 18:49:19] Created diff: /app/public/diff/heapd_refs_3946.diff (2 lines)
[INFO] [2021-05-31 18:49:19] ...nodes (/app/public/data/heapd/taxa.txt)
[CMD] [2021-05-31 18:49:19] echo "0a" > /app/public/diff/heapd_nodes_3946.diff
[CMD] [2021-05-31 18:49:20] tail -n +1 /app/public/converted_csv/heapd_nodes_3946.csv >> /app/public/diff/heapd_nodes_3946.diff
[CMD] [2021-05-31 18:49:20] echo "." >> /app/public/diff/heapd_nodes_3946.diff
[INFO] [2021-05-31 18:49:21] Created diff: /app/public/diff/heapd_nodes_3946.diff (52 lines)
[INFO] [2021-05-31 18:49:21] ...media (/app/public/data/heapd/media.txt)
[CMD] [2021-05-31 18:49:21] echo "0a" > /app/public/diff/heapd_media_3946.diff
[CMD] [2021-05-31 18:49:21] tail -n +1 /app/public/converted_csv/heapd_media_3946.csv >> /app/public/diff/heapd_media_3946.diff
[CMD] [2021-05-31 18:49:21] echo "." >> /app/public/diff/heapd_media_3946.diff
[INFO] [2021-05-31 18:49:22] Created diff: /app/public/diff/heapd_media_3946.diff (2 lines)
[INFO] [2021-05-31 18:49:22] ...vernaculars (/app/public/data/heapd/common names.txt)
[CMD] [2021-05-31 18:49:22] echo "0a" > /app/public/diff/heapd_vernaculars_3946.diff
[CMD] [2021-05-31 18:49:22] tail -n +1 /app/public/converted_csv/heapd_vernaculars_3946.csv >> /app/public/diff/heapd_vernaculars_3946.diff
[CMD] [2021-05-31 18:49:22] echo "." >> /app/public/diff/heapd_vernaculars_3946.diff
[INFO] [2021-05-31 18:49:23] Created diff: /app/public/diff/heapd_vernaculars_3946.diff (2 lines)
[INFO] [2021-05-31 18:49:23] ...occurrences (/app/public/data/heapd/occurrences.txt)
[CMD] [2021-05-31 18:49:23] echo "0a" > /app/public/diff/heapd_occurrences_3946.diff
[CMD] [2021-05-31 18:49:23] tail -n +1 /app/public/converted_csv/heapd_occurrences_3946.csv >> /app/public/diff/heapd_occurrences_3946.diff
[CMD] [2021-05-31 18:49:23] echo "." >> /app/public/diff/heapd_occurrences_3946.diff
[INFO] [2021-05-31 18:49:24] Created diff: /app/public/diff/heapd_occurrences_3946.diff (52 lines)
[INFO] [2021-05-31 18:49:24] ...assocs (/app/public/data/heapd/associations.txt)
[CMD] [2021-05-31 18:49:24] echo "0a" > /app/public/diff/heapd_assocs_3946.diff
[CMD] [2021-05-31 18:49:24] tail -n +1 /app/public/converted_csv/heapd_assocs_3946.csv >> /app/public/diff/heapd_assocs_3946.diff
[CMD] [2021-05-31 18:49:25] echo "." >> /app/public/diff/heapd_assocs_3946.diff
[INFO] [2021-05-31 18:49:25] Created diff: /app/public/diff/heapd_assocs_3946.diff (2 lines)
[INFO] [2021-05-31 18:49:25] ...measurements (/app/public/data/heapd/measurements or facts.txt)
[CMD] [2021-05-31 18:49:25] echo "0a" > /app/public/diff/heapd_measurements_3946.diff
[CMD] [2021-05-31 18:49:25] tail -n +1 /app/public/converted_csv/heapd_measurements_3946.csv >> /app/public/diff/heapd_measurements_3946.diff
[CMD] [2021-05-31 18:49:26] echo "." >> /app/public/diff/heapd_measurements_3946.diff
[INFO] [2021-05-31 18:49:26] Created diff: /app/public/diff/heapd_measurements_3946.diff (278 lines)
[STOP] [2021-05-31 18:49:26] calculate_delta
[START] [2021-05-31 18:49:26] parse_diff_and_store
[INFO] [2021-05-31 18:49:26] Handling diff: /app/public/diff/heapd_agents_3946.diff (2 lines)
[INFO] [2021-05-31 18:49:26] Loading agents diff file into memory (2 /app/public/diff/heapd_agents_3946.diff lines)...
[INFO] [2021-05-31 18:49:27] Handling diff: /app/public/diff/heapd_refs_3946.diff (2 lines)
[INFO] [2021-05-31 18:49:27] Loading refs diff file into memory (2 /app/public/diff/heapd_refs_3946.diff lines)...
[INFO] [2021-05-31 18:49:28] Handling diff: /app/public/diff/heapd_nodes_3946.diff (52 lines)
[INFO] [2021-05-31 18:49:28] Loading nodes diff file into memory (52 /app/public/diff/heapd_nodes_3946.diff lines)...
[INFO] [2021-05-31 18:49:28] Handling diff: /app/public/diff/heapd_media_3946.diff (2 lines)
[INFO] [2021-05-31 18:49:29] Loading media diff file into memory (2 /app/public/diff/heapd_media_3946.diff lines)...
[INFO] [2021-05-31 18:49:29] Handling diff: /app/public/diff/heapd_vernaculars_3946.diff (2 lines)
[INFO] [2021-05-31 18:49:29] Loading vernaculars diff file into memory (2 /app/public/diff/heapd_vernaculars_3946.diff lines)...
[INFO] [2021-05-31 18:49:30] Handling diff: /app/public/diff/heapd_occurrences_3946.diff (52 lines)
[INFO] [2021-05-31 18:49:30] Loading occurrences diff file into memory (52 /app/public/diff/heapd_occurrences_3946.diff lines)...
[INFO] [2021-05-31 18:49:31] Handling diff: /app/public/diff/heapd_assocs_3946.diff (2 lines)
[INFO] [2021-05-31 18:49:31] Loading assocs diff file into memory (2 /app/public/diff/heapd_assocs_3946.diff lines)...
[INFO] [2021-05-31 18:49:31] Handling diff: /app/public/diff/heapd_measurements_3946.diff (278 lines)
[INFO] [2021-05-31 18:49:32] Loading measurements diff file into memory (278 /app/public/diff/heapd_measurements_3946.diff lines)...
[INFO] [2021-05-31 18:49:32] Storing 97 ScientificNames
[INFO] [2021-05-31 18:49:32] Processing group of 97 in 1 groups of 1000
[INFO] [2021-05-31 18:49:32] Average Time: 0.03
[INFO] [2021-05-31 18:49:32] Total Time: 1s
[INFO] [2021-05-31 18:49:32] Storing 97 Nodes
[INFO] [2021-05-31 18:49:32] Processing group of 97 in 1 groups of 1000
[INFO] [2021-05-31 18:49:32] Average Time: 0.03
[INFO] [2021-05-31 18:49:32] Total Time: 1s
[INFO] [2021-05-31 18:49:32] Storing 50 Occurrences
[INFO] [2021-05-31 18:49:32] Processing group of 50 in 1 groups of 1000
[INFO] [2021-05-31 18:49:32] Average Time: 0.01
[INFO] [2021-05-31 18:49:32] Total Time: 1s
[INFO] [2021-05-31 18:49:32] Storing 50 OccurrenceMetadata
[INFO] [2021-05-31 18:49:32] Processing group of 50 in 1 groups of 1000
[INFO] [2021-05-31 18:49:32] Average Time: 0.01
[INFO] [2021-05-31 18:49:32] Total Time: 1s
[INFO] [2021-05-31 18:49:32] Storing 276 Traits
[INFO] [2021-05-31 18:49:32] Processing group of 276 in 1 groups of 1000
[INFO] [2021-05-31 18:49:32] Average Time: 0.08
[INFO] [2021-05-31 18:49:32] Total Time: 1s
[INFO] [2021-05-31 18:49:32] Storing 142 MetaTraits
[INFO] [2021-05-31 18:49:32] Processing group of 142 in 1 groups of 1000
[INFO] [2021-05-31 18:49:32] Average Time: 0.02
[INFO] [2021-05-31 18:49:32] Total Time: 1s
[STOP] [2021-05-31 18:49:32] parse_diff_and_store
[START] [2021-05-31 18:49:32] resolve_keys
[INFO] [2021-05-31 18:49:38] Occurrences to nodes (through scientific_names)...
[INFO] [2021-05-31 18:49:38] traits to occurrences...
[INFO] [2021-05-31 18:49:38] traits to nodes (through occurrences)...
[INFO] [2021-05-31 18:49:38] Traits to sex term...
[INFO] [2021-05-31 18:49:38] Traits to lifestage term...
[INFO] [2021-05-31 18:49:38] MetaTraits to traits...
[INFO] [2021-05-31 18:49:38] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2021-05-31 18:49:38] Assocs to occurrences...
[INFO] [2021-05-31 18:49:38] Assocs to nodes...
[INFO] [2021-05-31 18:49:38] Assoc to sex term...
[INFO] [2021-05-31 18:49:38] Assoc to lifestage term...
[INFO] [2021-05-31 18:49:38] MetaAssoc to assocs...
[STOP] [2021-05-31 18:49:38] resolve_keys
[START] [2021-05-31 18:49:38] hold_for_later_1
[STOP] [2021-05-31 18:49:38] hold_for_later_1
[START] [2021-05-31 18:49:38] hold_for_later_2
[STOP] [2021-05-31 18:49:38] hold_for_later_2
[START] [2021-05-31 18:49:38] resolve_missing_parents
[STOP] [2021-05-31 18:49:38] resolve_missing_parents
[START] [2021-05-31 18:49:38] rebuild_nodes
[START] [2021-05-31 18:49:38] Flattener#flatten
[START] [2021-05-31 18:49:38] Flattener#study_resource
[START] [2021-05-31 18:49:38] Flattener#build_ancestry
[STOP] [2021-05-31 18:49:38] Flattener#build_ancestry
[INFO] [2021-05-31 18:49:38] 97 ancestry keys
[START] [2021-05-31 18:49:38] build_node_ancestors
[INFO] [2021-05-31 18:49:38] old ancestors deleted.
[STOP] [2021-05-31 18:49:38] build_node_ancestors
[START] [2021-05-31 18:49:38] Flattener#propagate_ancestor_ids
[STOP] [2021-05-31 18:49:38] Flattener#propagate_ancestor_ids
[STOP] [2021-05-31 18:49:38] Flattener#flatten
[STOP] [2021-05-31 18:49:38] rebuild_nodes
[START] [2021-05-31 18:49:38] resolve_missing_media_owners
[STOP] [2021-05-31 18:49:38] resolve_missing_media_owners
[START] [2021-05-31 18:49:38] sanitize_media_verbatims
[STOP] [2021-05-31 18:49:38] sanitize_media_verbatims
[START] [2021-05-31 18:49:38] queue_downloads
[STOP] [2021-05-31 18:49:38] queue_downloads
[START] [2021-05-31 18:49:38] parse_names
[WARN] [2021-05-31 18:49:38] I see 97 names which still need to be parsed.
[WARN] [2021-05-31 18:49:40] I see 17 names which still need to be parsed.
[WARN] [2021-05-31 18:49:41] I see 1 names which still need to be parsed.
[STOP] [2021-05-31 18:49:42] parse_names
[START] [2021-05-31 18:49:42] denormalize_canonical_names_to_nodes
[STOP] [2021-05-31 18:49:42] denormalize_canonical_names_to_nodes
[START] [2021-05-31 18:49:42] match_nodes
[START] [2021-05-31 18:49:42] map_all_nodes_to_pages
[STOP] [2021-05-31 18:49:43] map_all_nodes_to_pages
[INFO] [2021-05-31 18:49:43] 14 Unmatched nodes (of 97)! That's too many to output. Full list in /app/public/data/heapd/unmatched_nodes.txt; First 10: Canonical: Anuroganthus; Node#95129448; ResourceID: Anuroganthus; Canonical: Anuroganthus; Node#95129449; ResourceID: Anuroganthus; Canonical: Austriadactyus; Node#95129452; ResourceID: Austriadactyus; Canonical: Austriadactyus; Node#95129453; ResourceID: Austriadactyus; Canonical: Azhdarchoid; Node#95129454; ResourceID: Azhdarchoid; Canonical: Azhdarchoid; Node#95129455; ResourceID: Azhdarchoid; Canonical: Coloborhynchus robustus; Node#95129464; ResourceID: Coloborhynchus_robustus; Canonical: Ctenochasma gracilis; Node#95129467; ResourceID: Ctenochasma_gracilis; Canonical: Germanodactylus rhamphistinus; Node#95129490; ResourceID: Germanodactylus_rhamphistinus; Canonical: Hauxiapterus; Node#95129495; ResourceID: Hauxiapterus
[START] [2021-05-31 18:49:43] update_nodes
[STOP] [2021-05-31 18:49:43] update_nodes
[STOP] [2021-05-31 18:49:43] match_nodes
[START] [2021-05-31 18:49:43] reindex_search
[STOP] [2021-05-31 18:49:43] reindex_search
[START] [2021-05-31 18:49:43] normalize_units
[STOP] [2021-05-31 18:49:44] normalize_units
[START] [2021-05-31 18:49:44] calculate_statistics
[STOP] [2021-05-31 18:49:44] calculate_statistics
[START] [2021-05-31 18:49:44] complete_harvest_instance
[START] [2021-05-31 18:49:44] overall_tsv_creation
[INFO] [2021-05-31 18:49:44] Processing group of 97 in 1 batches of 10000
[INFO] [2021-05-31 18:50:16] 90 Traits (unfiltered)...
[INFO] [2021-05-31 18:50:42] 90 Traits (filtered)...
[INFO] [2021-05-31 18:50:42] 0 Associations (filtered)...
[INFO] [2021-05-31 18:50:42] 180 metadata added.
[INFO] [2021-05-31 18:50:42] 0 metadata added.
[INFO] [2021-05-31 18:51:05] Average Time: 59.75
[INFO] [2021-05-31 18:51:05] Total Time: 1m22s
[STOP] [2021-05-31 18:51:05] overall_tsv_creation
[INFO] [2021-05-31 18:51:05] Done. Check your files:
[INFO] [2021-05-31 18:51:05] (87 lines) /app/public/data/heapd/publish_nodes.tsv
[INFO] [2021-05-31 18:51:06] (127 lines) /app/public/data/heapd/publish_node_ancestors.tsv
[INFO] [2021-05-31 18:51:06] (97 lines) /app/public/data/heapd/publish_scientific_names.tsv
[INFO] [2021-05-31 18:51:06] (91 lines) /app/public/data/heapd/publish_traits.tsv
[INFO] [2021-05-31 18:51:07] (181 lines) /app/public/data/heapd/publish_metadata.tsv
[STOP] [2021-05-31 18:51:07] complete_harvest_instance
[START] [2021-05-31 18:51:07] completed
[STOP] [2021-05-31 18:51:07] completed
[STOP] [2021-05-31 18:51:07] logged process, took 113.07
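
The [START]/[STOP] pairs in the log above can be used to see where the harvest time went. The following is a minimal Python sketch, not part of the harvester itself: the function name `step_durations` and the regular expression are assumptions made for illustration, and the line format is inferred only from the entries shown in this log. It pairs each [STOP] with the most recent [START] of the same step name and ignores [STOP] lines that have no matching [START] (such as `create_harvest_instance`).

```python
import re
from datetime import datetime

# Assumed line shape, inferred from this log: "[LEVEL] [YYYY-MM-DD HH:MM:SS] message"
LINE_RE = re.compile(
    r"^\[(START|STOP|INFO|WARN|CMD)\] "
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.*)$"
)


def step_durations(lines):
    """Pair [START]/[STOP] entries by step name.

    Returns a list of (step_name, seconds) tuples in the order the
    steps stopped. Timestamps have one-second resolution, so the
    results are whole seconds.
    """
    started = {}    # step name -> datetime of its most recent [START]
    durations = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # headers, blank lines, anything that is not a log entry
        level, stamp, message = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if level == "START":
            started[message] = when
        elif level == "STOP" and message in started:
            elapsed = when - started.pop(message)
            durations.append((message, elapsed.total_seconds()))
    return durations


# Lines copied from the log above.
sample = [
    "[START] [2021-05-31 18:49:26] parse_diff_and_store",
    "[INFO] [2021-05-31 18:49:32] Storing 97 ScientificNames",
    "[STOP] [2021-05-31 18:49:32] parse_diff_and_store",
    "[START] [2021-05-31 18:49:44] overall_tsv_creation",
    "[STOP] [2021-05-31 18:51:05] overall_tsv_creation",
]

for name, seconds in step_durations(sample):
    print(f"{name}: {seconds:.0f}s")
```

On these sample lines, `overall_tsv_creation` accounts for 81 of the roughly 113 seconds the whole process took, which is where most of the harvest time went.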

Latest Process