Harvest for Pereyra et al 2016
Created: 14 Oct 08:28

Stage: completed
Fetched: 14 Oct 08:28
Validated: 14 Oct 08:28
Deltas Created: 14 Oct 08:28
Units Normalized: 14 Oct 08:28
Ancestry Built: 14 Oct 08:28
Nodes Matched: 14 Oct 08:28
Names Parsed: 14 Oct 08:28
New Models Stored: 14 Oct 08:28
Indexed: 14 Oct 08:28
Completed: 14 Oct 08:29
Time to Harvest: less than a minute

Harvesting Log

(159 lines)
[INFO] [2023-10-14 08:28:26] Created harvest instance #4469
[STOP] [2023-10-14 08:28:26] create_harvest_instance
[START] [2023-10-14 08:28:26] fetch_files
[STOP] [2023-10-14 08:28:26] fetch_files
[START] [2023-10-14 08:28:26] validate_each_file
[INFO] [2023-10-14 08:28:26] Looping over 4 formats...
[INFO] [2023-10-14 08:28:26] ...refs (/app/public/data/pereyra_et_al_pe/references.txt)
[INFO] [2023-10-14 08:28:26] Valid: /app/public/data/pereyra_et_al_pe/converted_csv/pereyra_et_al_pe_refs_30884.csv (1 lines)
[INFO] [2023-10-14 08:28:26] ...nodes (/app/public/data/pereyra_et_al_pe/taxa.txt)
[INFO] [2023-10-14 08:28:26] Valid: /app/public/data/pereyra_et_al_pe/converted_csv/pereyra_et_al_pe_nodes_30881.csv (421 lines)
[INFO] [2023-10-14 08:28:26] ...occurrences (/app/public/data/pereyra_et_al_pe/occurrences.txt)
[INFO] [2023-10-14 08:28:26] Valid: /app/public/data/pereyra_et_al_pe/converted_csv/pereyra_et_al_pe_occurrences_30882.csv (421 lines)
[INFO] [2023-10-14 08:28:26] ...measurements (/app/public/data/pereyra_et_al_pe/measurementsorfacts.txt)
[INFO] [2023-10-14 08:28:26] Valid: /app/public/data/pereyra_et_al_pe/converted_csv/pereyra_et_al_pe_measurements_30883.csv (896 lines)
[STOP] [2023-10-14 08:28:26] validate_each_file
[START] [2023-10-14 08:28:26] convert_to_csv
[INFO] [2023-10-14 08:28:26] Looping over 4 formats...
[INFO] [2023-10-14 08:28:26] ...refs (/app/public/data/pereyra_et_al_pe/references.txt)
[CMD] [2023-10-14 08:28:26] /usr/bin/sort /app/public/data/pereyra_et_al_pe/converted_csv/pereyra_et_al_pe_refs_30884.csv > /app/public/data/pereyra_et_al_pe/converted_csv/pereyra_et_al_pe_refs_30884.csv_sorted
[INFO] [2023-10-14 08:28:27] Converted: /app/public/data/pereyra_et_al_pe/converted_csv/pereyra_et_al_pe_refs_30884.csv (1 lines)
[INFO] [2023-10-14 08:28:27] ...nodes (/app/public/data/pereyra_et_al_pe/taxa.txt)
[CMD] [2023-10-14 08:28:27] /usr/bin/sort /app/public/data/pereyra_et_al_pe/converted_csv/pereyra_et_al_pe_nodes_30881.csv > /app/public/data/pereyra_et_al_pe/converted_csv/pereyra_et_al_pe_nodes_30881.csv_sorted
[INFO] [2023-10-14 08:28:27] Converted: /app/public/data/pereyra_et_al_pe/converted_csv/pereyra_et_al_pe_nodes_30881.csv (421 lines)
[INFO] [2023-10-14 08:28:27] ...occurrences (/app/public/data/pereyra_et_al_pe/occurrences.txt)
[CMD] [2023-10-14 08:28:27] /usr/bin/sort /app/public/data/pereyra_et_al_pe/converted_csv/pereyra_et_al_pe_occurrences_30882.csv > /app/public/data/pereyra_et_al_pe/converted_csv/pereyra_et_al_pe_occurrences_30882.csv_sorted
[INFO] [2023-10-14 08:28:27] Converted: /app/public/data/pereyra_et_al_pe/converted_csv/pereyra_et_al_pe_occurrences_30882.csv (421 lines)
[INFO] [2023-10-14 08:28:27] ...measurements (/app/public/data/pereyra_et_al_pe/measurementsorfacts.txt)
[CMD] [2023-10-14 08:28:27] /usr/bin/sort /app/public/data/pereyra_et_al_pe/converted_csv/pereyra_et_al_pe_measurements_30883.csv > /app/public/data/pereyra_et_al_pe/converted_csv/pereyra_et_al_pe_measurements_30883.csv_sorted
[INFO] [2023-10-14 08:28:27] Converted: /app/public/data/pereyra_et_al_pe/converted_csv/pereyra_et_al_pe_measurements_30883.csv (896 lines)
[STOP] [2023-10-14 08:28:27] convert_to_csv
[START] [2023-10-14 08:28:27] calculate_delta
[INFO] [2023-10-14 08:28:27] Looping over 4 formats...
[INFO] [2023-10-14 08:28:27] ...refs (/app/public/data/pereyra_et_al_pe/references.txt)
[CMD] [2023-10-14 08:28:27] echo "0a" > /app/public/data/pereyra_et_al_pe/diff/pereyra_et_al_pe_refs_30884.diff
[CMD] [2023-10-14 08:28:27] tail -n +1 /app/public/data/pereyra_et_al_pe/converted_csv/pereyra_et_al_pe_refs_30884.csv >> /app/public/data/pereyra_et_al_pe/diff/pereyra_et_al_pe_refs_30884.diff
[CMD] [2023-10-14 08:28:27] echo "." >> /app/public/data/pereyra_et_al_pe/diff/pereyra_et_al_pe_refs_30884.diff
[INFO] [2023-10-14 08:28:27] Created diff: /app/public/data/pereyra_et_al_pe/diff/pereyra_et_al_pe_refs_30884.diff (3 lines)
[INFO] [2023-10-14 08:28:27] ...nodes (/app/public/data/pereyra_et_al_pe/taxa.txt)
[CMD] [2023-10-14 08:28:27] echo "0a" > /app/public/data/pereyra_et_al_pe/diff/pereyra_et_al_pe_nodes_30881.diff
[CMD] [2023-10-14 08:28:27] tail -n +1 /app/public/data/pereyra_et_al_pe/converted_csv/pereyra_et_al_pe_nodes_30881.csv >> /app/public/data/pereyra_et_al_pe/diff/pereyra_et_al_pe_nodes_30881.diff
[CMD] [2023-10-14 08:28:27] echo "." >> /app/public/data/pereyra_et_al_pe/diff/pereyra_et_al_pe_nodes_30881.diff
[INFO] [2023-10-14 08:28:27] Created diff: /app/public/data/pereyra_et_al_pe/diff/pereyra_et_al_pe_nodes_30881.diff (423 lines)
[INFO] [2023-10-14 08:28:27] ...occurrences (/app/public/data/pereyra_et_al_pe/occurrences.txt)
[CMD] [2023-10-14 08:28:27] echo "0a" > /app/public/data/pereyra_et_al_pe/diff/pereyra_et_al_pe_occurrences_30882.diff
[CMD] [2023-10-14 08:28:27] tail -n +1 /app/public/data/pereyra_et_al_pe/converted_csv/pereyra_et_al_pe_occurrences_30882.csv >> /app/public/data/pereyra_et_al_pe/diff/pereyra_et_al_pe_occurrences_30882.diff
[CMD] [2023-10-14 08:28:27] echo "." >> /app/public/data/pereyra_et_al_pe/diff/pereyra_et_al_pe_occurrences_30882.diff
[INFO] [2023-10-14 08:28:27] Created diff: /app/public/data/pereyra_et_al_pe/diff/pereyra_et_al_pe_occurrences_30882.diff (423 lines)
[INFO] [2023-10-14 08:28:27] ...measurements (/app/public/data/pereyra_et_al_pe/measurementsorfacts.txt)
[CMD] [2023-10-14 08:28:27] echo "0a" > /app/public/data/pereyra_et_al_pe/diff/pereyra_et_al_pe_measurements_30883.diff
[CMD] [2023-10-14 08:28:28] tail -n +1 /app/public/data/pereyra_et_al_pe/converted_csv/pereyra_et_al_pe_measurements_30883.csv >> /app/public/data/pereyra_et_al_pe/diff/pereyra_et_al_pe_measurements_30883.diff
[CMD] [2023-10-14 08:28:28] echo "." >> /app/public/data/pereyra_et_al_pe/diff/pereyra_et_al_pe_measurements_30883.diff
[INFO] [2023-10-14 08:28:28] Created diff: /app/public/data/pereyra_et_al_pe/diff/pereyra_et_al_pe_measurements_30883.diff (898 lines)
[STOP] [2023-10-14 08:28:28] calculate_delta
[START] [2023-10-14 08:28:28] parse_diff_and_store
[INFO] [2023-10-14 08:28:28] Handling diff: /app/public/data/pereyra_et_al_pe/diff/pereyra_et_al_pe_refs_30884.diff (3 lines)
[INFO] [2023-10-14 08:28:28] Loading refs diff file into memory (3 lines)...
[INFO] [2023-10-14 08:28:28] Storing 1 References (1/1/3)
[INFO] [2023-10-14 08:28:28] Handling diff: /app/public/data/pereyra_et_al_pe/diff/pereyra_et_al_pe_nodes_30881.diff (423 lines)
[INFO] [2023-10-14 08:28:28] Loading nodes diff file into memory (423 lines)...
[INFO] [2023-10-14 08:28:28] Storing 421 ScientificNames (842/421/423)
[INFO] [2023-10-14 08:28:28] Storing 421 Nodes (842/421/423)
[INFO] [2023-10-14 08:28:28] Handling diff: /app/public/data/pereyra_et_al_pe/diff/pereyra_et_al_pe_occurrences_30882.diff (423 lines)
[INFO] [2023-10-14 08:28:28] Loading occurrences diff file into memory (423 lines)...
[INFO] [2023-10-14 08:28:28] Storing 421 Occurrences (842/421/423)
[INFO] [2023-10-14 08:28:28] Storing 421 OccurrenceMetadata (842/421/423)
[INFO] [2023-10-14 08:28:29] Handling diff: /app/public/data/pereyra_et_al_pe/diff/pereyra_et_al_pe_measurements_30883.diff (898 lines)
[INFO] [2023-10-14 08:28:29] Loading measurements diff file into memory (898 lines)...
[INFO] [2023-10-14 08:28:29] Storing 896 Traits (2106/896/898)
[INFO] [2023-10-14 08:28:30] Storing 392 TraitsReferences (2106/896/898)
[INFO] [2023-10-14 08:28:30] Storing 818 MetaTraits (2106/896/898)
[STOP] [2023-10-14 08:28:30] parse_diff_and_store
[START] [2023-10-14 08:28:30] resolve_keys
[2023-10-14 08:28:30] Resolving downloaded urls (this is not actually downloading them yet)
[INFO] [2023-10-14 08:28:38] Occurrences to nodes (through scientific_names)...
[INFO] [2023-10-14 08:28:38] traits to occurrences...
[INFO] [2023-10-14 08:28:38] traits to nodes (through occurrences)...
[INFO] [2023-10-14 08:28:38] Traits to sex term...
[INFO] [2023-10-14 08:28:38] Traits to lifestage term...
[INFO] [2023-10-14 08:28:38] MetaTraits to traits...
[INFO] [2023-10-14 08:28:38] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2023-10-14 08:28:38] Assocs to occurrences...
[INFO] [2023-10-14 08:28:38] Assocs to nodes...
[INFO] [2023-10-14 08:28:38] Assoc to sex term...
[INFO] [2023-10-14 08:28:38] Assoc to lifestage term...
[INFO] [2023-10-14 08:28:38] MetaAssoc to assocs...
[STOP] [2023-10-14 08:28:38] resolve_keys
[START] [2023-10-14 08:28:38] hold_for_later_1
[STOP] [2023-10-14 08:28:38] hold_for_later_1
[START] [2023-10-14 08:28:38] hold_for_later_2
[STOP] [2023-10-14 08:28:38] hold_for_later_2
[START] [2023-10-14 08:28:38] resolve_missing_parents
[STOP] [2023-10-14 08:28:38] resolve_missing_parents
[START] [2023-10-14 08:28:38] rebuild_nodes
[START] [2023-10-14 08:28:38] Flattener#flatten
[START] [2023-10-14 08:28:38] Flattener#study_resource
[START] [2023-10-14 08:28:38] Flattener#build_ancestry
[STOP] [2023-10-14 08:28:38] Flattener#build_ancestry
[INFO] [2023-10-14 08:28:38] 421 ancestry keys
[START] [2023-10-14 08:28:38] build_node_ancestors
[INFO] [2023-10-14 08:28:38] old ancestors deleted.
[STOP] [2023-10-14 08:28:38] build_node_ancestors
[WARN] [2023-10-14 08:28:38] Flattener: nothing to flatten! (Completely flat resource?)
[STOP] [2023-10-14 08:28:38] Flattener#flatten
[STOP] [2023-10-14 08:28:38] rebuild_nodes
[START] [2023-10-14 08:28:38] resolve_missing_media_owners
[STOP] [2023-10-14 08:28:38] resolve_missing_media_owners
[START] [2023-10-14 08:28:38] sanitize_media_verbatims
[STOP] [2023-10-14 08:28:38] sanitize_media_verbatims
[START] [2023-10-14 08:28:38] queue_downloads
[STOP] [2023-10-14 08:28:38] queue_downloads
[START] [2023-10-14 08:28:38] parse_names
[WARN] [2023-10-14 08:28:38] I see 421 names which still need to be parsed.
[WARN] [2023-10-14 08:28:38] Names to parse: 421 formatted: 421 learned: 421 parsed: 421
[STOP] [2023-10-14 08:28:39] parse_names
[START] [2023-10-14 08:28:39] denormalize_canonical_names_to_nodes
[STOP] [2023-10-14 08:28:39] denormalize_canonical_names_to_nodes
[START] [2023-10-14 08:28:39] match_nodes
[START] [2023-10-14 08:28:39] map_all_nodes_to_pages
[STOP] [2023-10-14 08:28:40] map_all_nodes_to_pages
[INFO] [2023-10-14 08:28:40] ZERO unmatched nodes (of 421)! Nicely done.
[START] [2023-10-14 08:28:40] update_nodes
[STOP] [2023-10-14 08:28:40] update_nodes
[STOP] [2023-10-14 08:28:40] match_nodes
[START] [2023-10-14 08:28:40] reindex_search
[STOP] [2023-10-14 08:28:40] reindex_search
[START] [2023-10-14 08:28:40] normalize_units
[STOP] [2023-10-14 08:28:40] normalize_units
[START] [2023-10-14 08:28:40] calculate_statistics
[2023-10-14 08:28:40] ZERO NODE ANCESTORS. Is this actually a completely flat resource?
[INFO] [2023-10-14 08:28:45] Duplicate page_id count: 0
[STOP] [2023-10-14 08:28:45] calculate_statistics
[START] [2023-10-14 08:28:45] complete_harvest_instance
[START] [2023-10-14 08:28:45] overall_tsv_creation
[INFO] [2023-10-14 08:28:45] Exporting 421 nodes as TSV in batches of 10000...
[INFO] [2023-10-14 08:28:45] Processing group of 421 in 1 batches of 10000
[INFO] [2023-10-14 08:28:46] 426 Traits (unfiltered) and 0 associations...
[INFO] [2023-10-14 08:28:46] Building Traits map for 421 nodes (this can take a while)...
[INFO] [2023-10-14 08:28:46] Mapped 426 traits (818 meta) for 421 nodes.
[INFO] [2023-10-14 08:28:46] Building Associations map (this can take a while)...
[INFO] [2023-10-14 08:28:46] Done. 0 assocs mapped (0 meta).
[INFO] [2023-10-14 08:28:46] Adding 426 traits...
[INFO] [2023-10-14 08:28:46] Trait #291667990 in key 291667990 has 41 metadata... that seems high?
[INFO] [2023-10-14 08:28:46] Trait #291668145 in key 291668145 has 340 metadata... that seems high?
[INFO] [2023-10-14 08:28:46] 862 metadata added.
[INFO] [2023-10-14 08:28:46] Adding 0 assocs...
[INFO] [2023-10-14 08:28:46] 0 metadata added.
[INFO] [2023-10-14 08:29:30] Processed 421/421 nodes
[INFO] [2023-10-14 08:29:30] Average Time: 45.01
[INFO] [2023-10-14 08:29:30] Total Time: 46s
[STOP] [2023-10-14 08:29:30] overall_tsv_creation
[INFO] [2023-10-14 08:29:30] Done. Check your files:
[INFO] [2023-10-14 08:29:30] (421 lines) /app/public/data/pereyra_et_al_pe/publish_nodes.tsv
[INFO] [2023-10-14 08:29:31] (421 lines) /app/public/data/pereyra_et_al_pe/publish_scientific_names.tsv
[INFO] [2023-10-14 08:29:31] (427 lines) /app/public/data/pereyra_et_al_pe/publish_traits.tsv
[INFO] [2023-10-14 08:29:31] (863 lines) /app/public/data/pereyra_et_al_pe/publish_metadata.tsv
[STOP] [2023-10-14 08:29:31] complete_harvest_instance
[START] [2023-10-14 08:29:31] completed
[STOP] [2023-10-14 08:29:31] completed
[STOP] [2023-10-14 08:29:31] logged process, took 64.51
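
Note on the calculate_delta step in the log: for each format, the logged commands write "0a" to the .diff file, append the whole converted CSV with tail -n +1, then append ".". This looks like an ed-style "append after line 0" script, which would explain why each diff is two lines longer than its CSV (1 -> 3, 421 -> 423, 896 -> 898). A minimal Python sketch of that construction, assuming this reading is correct; build_initial_diff is a hypothetical helper for illustration, not part of the harvester:

```python
def build_initial_diff(csv_lines):
    """Wrap every CSV line in an ed-style 'append after line 0' script.

    Hypothetical helper: mirrors the logged sequence
    echo "0a" / tail -n +1 <csv> / echo "." for a first harvest,
    where there is no previous file to compare against.
    """
    return ["0a", *csv_lines, "."]


if __name__ == "__main__":
    # CSV line counts taken from the log: refs, nodes/occurrences, measurements.
    for n in (1, 421, 896):
        diff = build_initial_diff([f"row{i}" for i in range(n)])
        print(n, "->", len(diff))
```

Under this assumption, the "(3 lines)", "(423 lines)" and "(898 lines)" figures reported for the diff files are simply the CSV line counts plus the two marker lines.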

Latest Process