Harvest for Hughes, 1987
Created: 13 Oct 11:34

Stage: completed
Fetched: 13 Oct 11:34
Validated: 13 Oct 11:34
Deltas Created: 13 Oct 11:34
Units Normalized: 13 Oct 11:34
Ancestry Built: 13 Oct 11:34
Nodes Matched: 13 Oct 11:34
Names Parsed: 13 Oct 11:34
New Models Stored: 13 Oct 11:34
Indexed: 13 Oct 11:34
Completed: 13 Oct 11:35
Time to Harvest: less than a minute

Harvesting Log

(206 lines)
[INFO] [2023-10-13 11:34:33] Created harvest instance #4437
[STOP] [2023-10-13 11:34:33] create_harvest_instance
[START] [2023-10-13 11:34:33] fetch_files
[STOP] [2023-10-13 11:34:33] fetch_files
[START] [2023-10-13 11:34:33] validate_each_file
[INFO] [2023-10-13 11:34:33] Looping over 8 formats...
[INFO] [2023-10-13 11:34:33] ...agents (/app/public/data/Hughes/agents.txt)
[INFO] [2023-10-13 11:34:33] Valid: /app/public/data/Hughes/converted_csv/Hughes_agents_30725.csv (0 lines)
[INFO] [2023-10-13 11:34:33] ...refs (/app/public/data/Hughes/references.txt)
[INFO] [2023-10-13 11:34:33] Valid: /app/public/data/Hughes/converted_csv/Hughes_refs_30724.csv (23 lines)
[INFO] [2023-10-13 11:34:33] ...nodes (/app/public/data/Hughes/taxa.txt)
[INFO] [2023-10-13 11:34:33] Valid: /app/public/data/Hughes/converted_csv/Hughes_nodes_30722.csv (31 lines)
[INFO] [2023-10-13 11:34:33] ...media (/app/public/data/Hughes/media.txt)
[INFO] [2023-10-13 11:34:33] Valid: /app/public/data/Hughes/converted_csv/Hughes_media_30721.csv (0 lines)
[INFO] [2023-10-13 11:34:33] ...vernaculars (/app/public/data/Hughes/common names.txt)
[INFO] [2023-10-13 11:34:33] Valid: /app/public/data/Hughes/converted_csv/Hughes_vernaculars_30723.csv (0 lines)
[INFO] [2023-10-13 11:34:33] ...occurrences (/app/public/data/Hughes/occurrences.txt)
[INFO] [2023-10-13 11:34:33] Valid: /app/public/data/Hughes/converted_csv/Hughes_occurrences_30726.csv (95 lines)
[INFO] [2023-10-13 11:34:33] ...assocs (/app/public/data/Hughes/associations.txt)
[INFO] [2023-10-13 11:34:33] Valid: /app/public/data/Hughes/converted_csv/Hughes_assocs_30728.csv (0 lines)
[INFO] [2023-10-13 11:34:33] ...measurements (/app/public/data/Hughes/measurements or facts.txt)
[INFO] [2023-10-13 11:34:33] Valid: /app/public/data/Hughes/converted_csv/Hughes_measurements_30727.csv (315 lines)
[STOP] [2023-10-13 11:34:33] validate_each_file
[START] [2023-10-13 11:34:33] convert_to_csv
[INFO] [2023-10-13 11:34:33] Looping over 8 formats...
[INFO] [2023-10-13 11:34:33] ...agents (/app/public/data/Hughes/agents.txt)
[CMD] [2023-10-13 11:34:33] /usr/bin/sort /app/public/data/Hughes/converted_csv/Hughes_agents_30725.csv > /app/public/data/Hughes/converted_csv/Hughes_agents_30725.csv_sorted
[INFO] [2023-10-13 11:34:33] Converted: /app/public/data/Hughes/converted_csv/Hughes_agents_30725.csv (0 lines)
[INFO] [2023-10-13 11:34:33] ...refs (/app/public/data/Hughes/references.txt)
[CMD] [2023-10-13 11:34:33] /usr/bin/sort /app/public/data/Hughes/converted_csv/Hughes_refs_30724.csv > /app/public/data/Hughes/converted_csv/Hughes_refs_30724.csv_sorted
[INFO] [2023-10-13 11:34:34] Converted: /app/public/data/Hughes/converted_csv/Hughes_refs_30724.csv (23 lines)
[INFO] [2023-10-13 11:34:34] ...nodes (/app/public/data/Hughes/taxa.txt)
[CMD] [2023-10-13 11:34:34] /usr/bin/sort /app/public/data/Hughes/converted_csv/Hughes_nodes_30722.csv > /app/public/data/Hughes/converted_csv/Hughes_nodes_30722.csv_sorted
[INFO] [2023-10-13 11:34:34] Converted: /app/public/data/Hughes/converted_csv/Hughes_nodes_30722.csv (31 lines)
[INFO] [2023-10-13 11:34:34] ...media (/app/public/data/Hughes/media.txt)
[CMD] [2023-10-13 11:34:34] /usr/bin/sort /app/public/data/Hughes/converted_csv/Hughes_media_30721.csv > /app/public/data/Hughes/converted_csv/Hughes_media_30721.csv_sorted
[INFO] [2023-10-13 11:34:34] Converted: /app/public/data/Hughes/converted_csv/Hughes_media_30721.csv (0 lines)
[INFO] [2023-10-13 11:34:34] ...vernaculars (/app/public/data/Hughes/common names.txt)
[CMD] [2023-10-13 11:34:34] /usr/bin/sort /app/public/data/Hughes/converted_csv/Hughes_vernaculars_30723.csv > /app/public/data/Hughes/converted_csv/Hughes_vernaculars_30723.csv_sorted
[INFO] [2023-10-13 11:34:34] Converted: /app/public/data/Hughes/converted_csv/Hughes_vernaculars_30723.csv (0 lines)
[INFO] [2023-10-13 11:34:34] ...occurrences (/app/public/data/Hughes/occurrences.txt)
[CMD] [2023-10-13 11:34:34] /usr/bin/sort /app/public/data/Hughes/converted_csv/Hughes_occurrences_30726.csv > /app/public/data/Hughes/converted_csv/Hughes_occurrences_30726.csv_sorted
[INFO] [2023-10-13 11:34:34] Converted: /app/public/data/Hughes/converted_csv/Hughes_occurrences_30726.csv (95 lines)
[INFO] [2023-10-13 11:34:34] ...assocs (/app/public/data/Hughes/associations.txt)
[CMD] [2023-10-13 11:34:34] /usr/bin/sort /app/public/data/Hughes/converted_csv/Hughes_assocs_30728.csv > /app/public/data/Hughes/converted_csv/Hughes_assocs_30728.csv_sorted
[INFO] [2023-10-13 11:34:34] Converted: /app/public/data/Hughes/converted_csv/Hughes_assocs_30728.csv (0 lines)
[INFO] [2023-10-13 11:34:34] ...measurements (/app/public/data/Hughes/measurements or facts.txt)
[CMD] [2023-10-13 11:34:34] /usr/bin/sort /app/public/data/Hughes/converted_csv/Hughes_measurements_30727.csv > /app/public/data/Hughes/converted_csv/Hughes_measurements_30727.csv_sorted
[INFO] [2023-10-13 11:34:34] Converted: /app/public/data/Hughes/converted_csv/Hughes_measurements_30727.csv (315 lines)
[STOP] [2023-10-13 11:34:34] convert_to_csv
[START] [2023-10-13 11:34:34] calculate_delta
[INFO] [2023-10-13 11:34:34] Looping over 8 formats...
[INFO] [2023-10-13 11:34:34] ...agents (/app/public/data/Hughes/agents.txt)
[CMD] [2023-10-13 11:34:34] echo "0a" > /app/public/data/Hughes/diff/Hughes_agents_30725.diff
[CMD] [2023-10-13 11:34:34] tail -n +1 /app/public/data/Hughes/converted_csv/Hughes_agents_30725.csv >> /app/public/data/Hughes/diff/Hughes_agents_30725.diff
[CMD] [2023-10-13 11:34:34] echo "." >> /app/public/data/Hughes/diff/Hughes_agents_30725.diff
[INFO] [2023-10-13 11:34:34] Created diff: /app/public/data/Hughes/diff/Hughes_agents_30725.diff (2 lines)
[INFO] [2023-10-13 11:34:34] ...refs (/app/public/data/Hughes/references.txt)
[CMD] [2023-10-13 11:34:34] echo "0a" > /app/public/data/Hughes/diff/Hughes_refs_30724.diff
[CMD] [2023-10-13 11:34:34] tail -n +1 /app/public/data/Hughes/converted_csv/Hughes_refs_30724.csv >> /app/public/data/Hughes/diff/Hughes_refs_30724.diff
[CMD] [2023-10-13 11:34:34] echo "." >> /app/public/data/Hughes/diff/Hughes_refs_30724.diff
[INFO] [2023-10-13 11:34:34] Created diff: /app/public/data/Hughes/diff/Hughes_refs_30724.diff (25 lines)
[INFO] [2023-10-13 11:34:34] ...nodes (/app/public/data/Hughes/taxa.txt)
[CMD] [2023-10-13 11:34:34] echo "0a" > /app/public/data/Hughes/diff/Hughes_nodes_30722.diff
[CMD] [2023-10-13 11:34:34] tail -n +1 /app/public/data/Hughes/converted_csv/Hughes_nodes_30722.csv >> /app/public/data/Hughes/diff/Hughes_nodes_30722.diff
[CMD] [2023-10-13 11:34:35] echo "." >> /app/public/data/Hughes/diff/Hughes_nodes_30722.diff
[INFO] [2023-10-13 11:34:35] Created diff: /app/public/data/Hughes/diff/Hughes_nodes_30722.diff (33 lines)
[INFO] [2023-10-13 11:34:35] ...media (/app/public/data/Hughes/media.txt)
[CMD] [2023-10-13 11:34:35] echo "0a" > /app/public/data/Hughes/diff/Hughes_media_30721.diff
[CMD] [2023-10-13 11:34:35] tail -n +1 /app/public/data/Hughes/converted_csv/Hughes_media_30721.csv >> /app/public/data/Hughes/diff/Hughes_media_30721.diff
[CMD] [2023-10-13 11:34:35] echo "." >> /app/public/data/Hughes/diff/Hughes_media_30721.diff
[INFO] [2023-10-13 11:34:35] Created diff: /app/public/data/Hughes/diff/Hughes_media_30721.diff (2 lines)
[INFO] [2023-10-13 11:34:35] ...vernaculars (/app/public/data/Hughes/common names.txt)
[CMD] [2023-10-13 11:34:35] echo "0a" > /app/public/data/Hughes/diff/Hughes_vernaculars_30723.diff
[CMD] [2023-10-13 11:34:35] tail -n +1 /app/public/data/Hughes/converted_csv/Hughes_vernaculars_30723.csv >> /app/public/data/Hughes/diff/Hughes_vernaculars_30723.diff
[CMD] [2023-10-13 11:34:35] echo "." >> /app/public/data/Hughes/diff/Hughes_vernaculars_30723.diff
[INFO] [2023-10-13 11:34:35] Created diff: /app/public/data/Hughes/diff/Hughes_vernaculars_30723.diff (2 lines)
[INFO] [2023-10-13 11:34:35] ...occurrences (/app/public/data/Hughes/occurrences.txt)
[CMD] [2023-10-13 11:34:35] echo "0a" > /app/public/data/Hughes/diff/Hughes_occurrences_30726.diff
[CMD] [2023-10-13 11:34:35] tail -n +1 /app/public/data/Hughes/converted_csv/Hughes_occurrences_30726.csv >> /app/public/data/Hughes/diff/Hughes_occurrences_30726.diff
[CMD] [2023-10-13 11:34:35] echo "." >> /app/public/data/Hughes/diff/Hughes_occurrences_30726.diff
[INFO] [2023-10-13 11:34:35] Created diff: /app/public/data/Hughes/diff/Hughes_occurrences_30726.diff (97 lines)
[INFO] [2023-10-13 11:34:35] ...assocs (/app/public/data/Hughes/associations.txt)
[CMD] [2023-10-13 11:34:35] echo "0a" > /app/public/data/Hughes/diff/Hughes_assocs_30728.diff
[CMD] [2023-10-13 11:34:35] tail -n +1 /app/public/data/Hughes/converted_csv/Hughes_assocs_30728.csv >> /app/public/data/Hughes/diff/Hughes_assocs_30728.diff
[CMD] [2023-10-13 11:34:35] echo "." >> /app/public/data/Hughes/diff/Hughes_assocs_30728.diff
[INFO] [2023-10-13 11:34:35] Created diff: /app/public/data/Hughes/diff/Hughes_assocs_30728.diff (2 lines)
[INFO] [2023-10-13 11:34:35] ...measurements (/app/public/data/Hughes/measurements or facts.txt)
[CMD] [2023-10-13 11:34:35] echo "0a" > /app/public/data/Hughes/diff/Hughes_measurements_30727.diff
[CMD] [2023-10-13 11:34:36] tail -n +1 /app/public/data/Hughes/converted_csv/Hughes_measurements_30727.csv >> /app/public/data/Hughes/diff/Hughes_measurements_30727.diff
[CMD] [2023-10-13 11:34:36] echo "." >> /app/public/data/Hughes/diff/Hughes_measurements_30727.diff
[INFO] [2023-10-13 11:34:36] Created diff: /app/public/data/Hughes/diff/Hughes_measurements_30727.diff (317 lines)
[STOP] [2023-10-13 11:34:36] calculate_delta
[START] [2023-10-13 11:34:36] parse_diff_and_store
[INFO] [2023-10-13 11:34:36] Handling diff: /app/public/data/Hughes/diff/Hughes_agents_30725.diff (2 lines)
[INFO] [2023-10-13 11:34:36] Loading agents diff file into memory (2 lines)...
[INFO] [2023-10-13 11:34:36] Handling diff: /app/public/data/Hughes/diff/Hughes_refs_30724.diff (25 lines)
[INFO] [2023-10-13 11:34:36] Loading refs diff file into memory (25 lines)...
[INFO] [2023-10-13 11:34:36] Storing 23 References (23/23/25)
[INFO] [2023-10-13 11:34:36] Handling diff: /app/public/data/Hughes/diff/Hughes_nodes_30722.diff (33 lines)
[INFO] [2023-10-13 11:34:36] Loading nodes diff file into memory (33 lines)...
[INFO] [2023-10-13 11:34:36] Storing 59 ScientificNames (118/31/33)
[INFO] [2023-10-13 11:34:36] Storing 59 Nodes (118/31/33)
[INFO] [2023-10-13 11:34:36] Handling diff: /app/public/data/Hughes/diff/Hughes_media_30721.diff (2 lines)
[INFO] [2023-10-13 11:34:36] Loading media diff file into memory (2 lines)...
[INFO] [2023-10-13 11:34:36] Handling diff: /app/public/data/Hughes/diff/Hughes_vernaculars_30723.diff (2 lines)
[INFO] [2023-10-13 11:34:36] Loading vernaculars diff file into memory (2 lines)...
[INFO] [2023-10-13 11:34:36] Handling diff: /app/public/data/Hughes/diff/Hughes_occurrences_30726.diff (97 lines)
[INFO] [2023-10-13 11:34:36] Loading occurrences diff file into memory (97 lines)...
[INFO] [2023-10-13 11:34:36] Storing 95 Occurrences (190/95/97)
[INFO] [2023-10-13 11:34:36] Storing 95 OccurrenceMetadata (190/95/97)
[INFO] [2023-10-13 11:34:36] Handling diff: /app/public/data/Hughes/diff/Hughes_assocs_30728.diff (2 lines)
[INFO] [2023-10-13 11:34:36] Loading assocs diff file into memory (2 lines)...
[INFO] [2023-10-13 11:34:36] Handling diff: /app/public/data/Hughes/diff/Hughes_measurements_30727.diff (317 lines)
[INFO] [2023-10-13 11:34:36] Loading measurements diff file into memory (317 lines)...
[INFO] [2023-10-13 11:34:37] Storing 315 Traits (516/315/317)
[INFO] [2023-10-13 11:34:37] Storing 140 MetaTraits (516/315/317)
[INFO] [2023-10-13 11:34:37] Storing 61 TraitsReferences (516/315/317)
[STOP] [2023-10-13 11:34:37] parse_diff_and_store
[START] [2023-10-13 11:34:37] resolve_keys
[2023-10-13 11:34:37] Resolving downloaded urls (this is not actually downloading them yet)
[INFO] [2023-10-13 11:34:44] Occurrences to nodes (through scientific_names)...
[INFO] [2023-10-13 11:34:44] traits to occurrences...
[INFO] [2023-10-13 11:34:44] traits to nodes (through occurrences)...
[INFO] [2023-10-13 11:34:44] Traits to sex term...
[INFO] [2023-10-13 11:34:44] Traits to lifestage term...
[INFO] [2023-10-13 11:34:44] MetaTraits to traits...
[INFO] [2023-10-13 11:34:44] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2023-10-13 11:34:45] Assocs to occurrences...
[INFO] [2023-10-13 11:34:45] Assocs to nodes...
[INFO] [2023-10-13 11:34:45] Assoc to sex term...
[INFO] [2023-10-13 11:34:45] Assoc to lifestage term...
[INFO] [2023-10-13 11:34:45] MetaAssoc to assocs...
[STOP] [2023-10-13 11:34:45] resolve_keys
[START] [2023-10-13 11:34:45] hold_for_later_1
[STOP] [2023-10-13 11:34:45] hold_for_later_1
[START] [2023-10-13 11:34:45] hold_for_later_2
[STOP] [2023-10-13 11:34:45] hold_for_later_2
[START] [2023-10-13 11:34:45] resolve_missing_parents
[STOP] [2023-10-13 11:34:45] resolve_missing_parents
[START] [2023-10-13 11:34:45] rebuild_nodes
[START] [2023-10-13 11:34:45] Flattener#flatten
[START] [2023-10-13 11:34:45] Flattener#study_resource
[START] [2023-10-13 11:34:45] Flattener#build_ancestry
[STOP] [2023-10-13 11:34:45] Flattener#build_ancestry
[INFO] [2023-10-13 11:34:45] 59 ancestry keys
[START] [2023-10-13 11:34:45] build_node_ancestors
[INFO] [2023-10-13 11:34:45] old ancestors deleted.
[STOP] [2023-10-13 11:34:45] build_node_ancestors
[START] [2023-10-13 11:34:45] Flattener#propagate_ancestor_ids
[STOP] [2023-10-13 11:34:45] Flattener#propagate_ancestor_ids
[STOP] [2023-10-13 11:34:45] Flattener#flatten
[STOP] [2023-10-13 11:34:45] rebuild_nodes
[START] [2023-10-13 11:34:45] resolve_missing_media_owners
[STOP] [2023-10-13 11:34:45] resolve_missing_media_owners
[START] [2023-10-13 11:34:45] sanitize_media_verbatims
[STOP] [2023-10-13 11:34:45] sanitize_media_verbatims
[START] [2023-10-13 11:34:45] queue_downloads
[STOP] [2023-10-13 11:34:45] queue_downloads
[START] [2023-10-13 11:34:45] parse_names
[WARN] [2023-10-13 11:34:45] I see 59 names which still need to be parsed.
[WARN] [2023-10-13 11:34:45] Names to parse: 59 formatted: 59 learned: 57 parsed: 59
[STOP] [2023-10-13 11:34:46] parse_names
[START] [2023-10-13 11:34:46] denormalize_canonical_names_to_nodes
[STOP] [2023-10-13 11:34:46] denormalize_canonical_names_to_nodes
[START] [2023-10-13 11:34:46] match_nodes
[START] [2023-10-13 11:34:46] map_all_nodes_to_pages
[STOP] [2023-10-13 11:34:48] map_all_nodes_to_pages
[INFO] [2023-10-13 11:34:48] Unmatched nodes (8 of 59): Canonical: Diploria labrynthlforms; Node#137158706; ResourceID: Diploria_labrynthlforms; Canonical: Dichocoenia; Node#137158703; ResourceID: Dichocoenia; Canonical: Faviidae; Node#137158711; ResourceID: Animalia/Anthozoa/Faviidae; Canonical: Goniastrea rebformis; Node#137158713; ResourceID: Goniastrea_rebformis; Canonical: Madracis mirabilis; Node#137158718; ResourceID: Madracis_mirabilis; Canonical: Porites; Node#137158731; ResourceID: Porites; Canonical: Tubastrea aurea; Node#137158742; ResourceID: Tubastrea_aurea; Canonical: Tubastrea micranthus; Node#137158743; ResourceID: Tubastrea_micranthus
[START] [2023-10-13 11:34:48] update_nodes
[STOP] [2023-10-13 11:34:48] update_nodes
[STOP] [2023-10-13 11:34:48] match_nodes
[START] [2023-10-13 11:34:48] reindex_search
[STOP] [2023-10-13 11:34:48] reindex_search
[START] [2023-10-13 11:34:48] normalize_units
[STOP] [2023-10-13 11:34:49] normalize_units
[START] [2023-10-13 11:34:49] calculate_statistics
[INFO] [2023-10-13 11:34:49] Duplicate page_id count: 0
[STOP] [2023-10-13 11:34:49] calculate_statistics
[START] [2023-10-13 11:34:49] complete_harvest_instance
[START] [2023-10-13 11:34:49] overall_tsv_creation
[INFO] [2023-10-13 11:34:49] Exporting 59 nodes as TSV in batches of 10000...
[INFO] [2023-10-13 11:34:49] Processing group of 59 in 1 batches of 10000
[INFO] [2023-10-13 11:34:49] 95 Traits (unfiltered) and 0 associations...
[INFO] [2023-10-13 11:34:49] Building Traits map for 59 nodes (this can take a while)...
[INFO] [2023-10-13 11:34:49] Mapped 95 traits (122 meta) for 59 nodes.
[INFO] [2023-10-13 11:34:49] Building Associations map (this can take a while)...
[INFO] [2023-10-13 11:34:49] Done. 0 assocs mapped (0 meta).
[INFO] [2023-10-13 11:34:49] Adding 95 traits...
[INFO] [2023-10-13 11:34:50] 310 metadata added.
[INFO] [2023-10-13 11:34:50] Adding 0 assocs...
[INFO] [2023-10-13 11:34:50] 0 metadata added.
[INFO] [2023-10-13 11:35:34] Processed 59/59 nodes
[INFO] [2023-10-13 11:35:34] Average Time: 44.91
[INFO] [2023-10-13 11:35:34] Total Time: 45s
[STOP] [2023-10-13 11:35:34] overall_tsv_creation
[INFO] [2023-10-13 11:35:34] Done. Check your files:
[INFO] [2023-10-13 11:35:34] (59 lines) /app/public/data/Hughes/publish_nodes.tsv
[INFO] [2023-10-13 11:35:34] (194 lines) /app/public/data/Hughes/publish_node_ancestors.tsv
[INFO] [2023-10-13 11:35:34] (59 lines) /app/public/data/Hughes/publish_scientific_names.tsv
[INFO] [2023-10-13 11:35:34] (96 lines) /app/public/data/Hughes/publish_traits.tsv
[INFO] [2023-10-13 11:35:35] (311 lines) /app/public/data/Hughes/publish_metadata.tsv
[STOP] [2023-10-13 11:35:35] complete_harvest_instance
[START] [2023-10-13 11:35:35] completed
[STOP] [2023-10-13 11:35:35] completed
[STOP] [2023-10-13 11:35:35] logged process, took 61.51
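
The [START]/[STOP] pairs in the log above can be used to see where the harvest spent its time. The following is a minimal sketch, not part of the harvester itself: the function name `step_durations` and the regular expression are assumptions made for illustration, and the only thing taken from the log is its line layout (`[LEVEL] [YYYY-MM-DD HH:MM:SS] message`). A [STOP] line with no matching [START], such as create_harvest_instance or the final "logged process" line, is skipped.

```python
import re
from datetime import datetime

# Matches lines such as "[START] [2023-10-13 11:34:46] match_nodes".
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)


def step_durations(lines):
    """Return {step_name: seconds} for every START/STOP pair in lines."""
    started = {}    # step name -> datetime of its START line
    durations = {}  # step name -> elapsed seconds
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # INFO, WARN, CMD and untagged lines are ignored
        kind, stamp, name = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            started[name] = when
        elif name in started:
            # STOP with a known START: record the elapsed time.
            durations[name] = (when - started.pop(name)).total_seconds()
    return durations


# A few lines copied from the log above.
sample = [
    "[START] [2023-10-13 11:34:46] match_nodes",
    "[START] [2023-10-13 11:34:46] map_all_nodes_to_pages",
    "[STOP] [2023-10-13 11:34:48] map_all_nodes_to_pages",
    "[STOP] [2023-10-13 11:34:48] match_nodes",
    "[START] [2023-10-13 11:34:49] overall_tsv_creation",
    "[STOP] [2023-10-13 11:35:34] overall_tsv_creation",
]

for step, seconds in step_durations(sample).items():
    print(f"{step}: {seconds:.0f}s")
```

On these sample lines, overall_tsv_creation accounts for 45 seconds, which agrees with the "Total Time: 45s" entry the log reports for that step. Timestamps have one-second resolution, so steps that start and stop within the same second show as 0.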
