Harvest for Schilthuizen and Davison 2005
Created: 13 Oct 09:19

Stage: completed
Fetched: 13 Oct 09:19
Validated: 13 Oct 09:19
Deltas Created: 13 Oct 09:19
Units Normalized: 13 Oct 09:19
Ancestry Built: 13 Oct 09:19
Nodes Matched: 13 Oct 09:19
Names Parsed: 13 Oct 09:19
New Models Stored: 13 Oct 09:19
Indexed: 13 Oct 09:19
Completed: 13 Oct 09:20
Time to Harvest: less than a minute

Harvesting Log

(158 lines)
[INFO] [2023-10-13 09:19:41] Created harvest instance #4418
[STOP] [2023-10-13 09:19:41] create_harvest_instance
[START] [2023-10-13 09:19:41] fetch_files
[STOP] [2023-10-13 09:19:41] fetch_files
[START] [2023-10-13 09:19:41] validate_each_file
[INFO] [2023-10-13 09:19:41] Looping over 4 formats...
[INFO] [2023-10-13 09:19:41] ...refs (/app/public/data/schilthuizen_dav/references.txt)
[INFO] [2023-10-13 09:19:41] Valid: /app/public/data/schilthuizen_dav/converted_csv/schilthuizen_dav_refs_30624.csv (4 lines)
[INFO] [2023-10-13 09:19:41] ...nodes (/app/public/data/schilthuizen_dav/taxa.txt)
[INFO] [2023-10-13 09:19:41] Valid: /app/public/data/schilthuizen_dav/converted_csv/schilthuizen_dav_nodes_30621.csv (25 lines)
[INFO] [2023-10-13 09:19:41] ...occurrences (/app/public/data/schilthuizen_dav/occurrences.txt)
[INFO] [2023-10-13 09:19:41] Valid: /app/public/data/schilthuizen_dav/converted_csv/schilthuizen_dav_occurrences_30622.csv (5 lines)
[INFO] [2023-10-13 09:19:41] ...measurements (/app/public/data/schilthuizen_dav/measurementOrFact.txt)
[INFO] [2023-10-13 09:19:41] Valid: /app/public/data/schilthuizen_dav/converted_csv/schilthuizen_dav_measurements_30623.csv (34 lines)
[STOP] [2023-10-13 09:19:41] validate_each_file
[START] [2023-10-13 09:19:41] convert_to_csv
[INFO] [2023-10-13 09:19:41] Looping over 4 formats...
[INFO] [2023-10-13 09:19:41] ...refs (/app/public/data/schilthuizen_dav/references.txt)
[CMD] [2023-10-13 09:19:41] /usr/bin/sort /app/public/data/schilthuizen_dav/converted_csv/schilthuizen_dav_refs_30624.csv > /app/public/data/schilthuizen_dav/converted_csv/schilthuizen_dav_refs_30624.csv_sorted
[INFO] [2023-10-13 09:19:41] Converted: /app/public/data/schilthuizen_dav/converted_csv/schilthuizen_dav_refs_30624.csv (4 lines)
[INFO] [2023-10-13 09:19:41] ...nodes (/app/public/data/schilthuizen_dav/taxa.txt)
[CMD] [2023-10-13 09:19:41] /usr/bin/sort /app/public/data/schilthuizen_dav/converted_csv/schilthuizen_dav_nodes_30621.csv > /app/public/data/schilthuizen_dav/converted_csv/schilthuizen_dav_nodes_30621.csv_sorted
[INFO] [2023-10-13 09:19:41] Converted: /app/public/data/schilthuizen_dav/converted_csv/schilthuizen_dav_nodes_30621.csv (25 lines)
[INFO] [2023-10-13 09:19:41] ...occurrences (/app/public/data/schilthuizen_dav/occurrences.txt)
[CMD] [2023-10-13 09:19:41] /usr/bin/sort /app/public/data/schilthuizen_dav/converted_csv/schilthuizen_dav_occurrences_30622.csv > /app/public/data/schilthuizen_dav/converted_csv/schilthuizen_dav_occurrences_30622.csv_sorted
[INFO] [2023-10-13 09:19:41] Converted: /app/public/data/schilthuizen_dav/converted_csv/schilthuizen_dav_occurrences_30622.csv (5 lines)
[INFO] [2023-10-13 09:19:41] ...measurements (/app/public/data/schilthuizen_dav/measurementOrFact.txt)
[CMD] [2023-10-13 09:19:41] /usr/bin/sort /app/public/data/schilthuizen_dav/converted_csv/schilthuizen_dav_measurements_30623.csv > /app/public/data/schilthuizen_dav/converted_csv/schilthuizen_dav_measurements_30623.csv_sorted
[INFO] [2023-10-13 09:19:41] Converted: /app/public/data/schilthuizen_dav/converted_csv/schilthuizen_dav_measurements_30623.csv (34 lines)
[STOP] [2023-10-13 09:19:41] convert_to_csv
[START] [2023-10-13 09:19:41] calculate_delta
[INFO] [2023-10-13 09:19:41] Looping over 4 formats...
[INFO] [2023-10-13 09:19:41] ...refs (/app/public/data/schilthuizen_dav/references.txt)
[CMD] [2023-10-13 09:19:41] echo "0a" > /app/public/data/schilthuizen_dav/diff/schilthuizen_dav_refs_30624.diff
[CMD] [2023-10-13 09:19:41] tail -n +1 /app/public/data/schilthuizen_dav/converted_csv/schilthuizen_dav_refs_30624.csv >> /app/public/data/schilthuizen_dav/diff/schilthuizen_dav_refs_30624.diff
[CMD] [2023-10-13 09:19:41] echo "." >> /app/public/data/schilthuizen_dav/diff/schilthuizen_dav_refs_30624.diff
[INFO] [2023-10-13 09:19:42] Created diff: /app/public/data/schilthuizen_dav/diff/schilthuizen_dav_refs_30624.diff (6 lines)
[INFO] [2023-10-13 09:19:42] ...nodes (/app/public/data/schilthuizen_dav/taxa.txt)
[CMD] [2023-10-13 09:19:42] echo "0a" > /app/public/data/schilthuizen_dav/diff/schilthuizen_dav_nodes_30621.diff
[CMD] [2023-10-13 09:19:42] tail -n +1 /app/public/data/schilthuizen_dav/converted_csv/schilthuizen_dav_nodes_30621.csv >> /app/public/data/schilthuizen_dav/diff/schilthuizen_dav_nodes_30621.diff
[CMD] [2023-10-13 09:19:42] echo "." >> /app/public/data/schilthuizen_dav/diff/schilthuizen_dav_nodes_30621.diff
[INFO] [2023-10-13 09:19:42] Created diff: /app/public/data/schilthuizen_dav/diff/schilthuizen_dav_nodes_30621.diff (27 lines)
[INFO] [2023-10-13 09:19:42] ...occurrences (/app/public/data/schilthuizen_dav/occurrences.txt)
[CMD] [2023-10-13 09:19:42] echo "0a" > /app/public/data/schilthuizen_dav/diff/schilthuizen_dav_occurrences_30622.diff
[CMD] [2023-10-13 09:19:42] tail -n +1 /app/public/data/schilthuizen_dav/converted_csv/schilthuizen_dav_occurrences_30622.csv >> /app/public/data/schilthuizen_dav/diff/schilthuizen_dav_occurrences_30622.diff
[CMD] [2023-10-13 09:19:42] echo "." >> /app/public/data/schilthuizen_dav/diff/schilthuizen_dav_occurrences_30622.diff
[INFO] [2023-10-13 09:19:42] Created diff: /app/public/data/schilthuizen_dav/diff/schilthuizen_dav_occurrences_30622.diff (7 lines)
[INFO] [2023-10-13 09:19:42] ...measurements (/app/public/data/schilthuizen_dav/measurementOrFact.txt)
[CMD] [2023-10-13 09:19:42] echo "0a" > /app/public/data/schilthuizen_dav/diff/schilthuizen_dav_measurements_30623.diff
[CMD] [2023-10-13 09:19:42] tail -n +1 /app/public/data/schilthuizen_dav/converted_csv/schilthuizen_dav_measurements_30623.csv >> /app/public/data/schilthuizen_dav/diff/schilthuizen_dav_measurements_30623.diff
[CMD] [2023-10-13 09:19:42] echo "." >> /app/public/data/schilthuizen_dav/diff/schilthuizen_dav_measurements_30623.diff
[INFO] [2023-10-13 09:19:42] Created diff: /app/public/data/schilthuizen_dav/diff/schilthuizen_dav_measurements_30623.diff (36 lines)
[STOP] [2023-10-13 09:19:42] calculate_delta
[START] [2023-10-13 09:19:42] parse_diff_and_store
[INFO] [2023-10-13 09:19:42] Handling diff: /app/public/data/schilthuizen_dav/diff/schilthuizen_dav_refs_30624.diff (6 lines)
[INFO] [2023-10-13 09:19:42] Loading refs diff file into memory (6 lines)...
[INFO] [2023-10-13 09:19:42] Storing 4 References (4/4/6)
[INFO] [2023-10-13 09:19:42] Handling diff: /app/public/data/schilthuizen_dav/diff/schilthuizen_dav_nodes_30621.diff (27 lines)
[INFO] [2023-10-13 09:19:42] Loading nodes diff file into memory (27 lines)...
[INFO] [2023-10-13 09:19:42] Storing 25 ScientificNames (50/25/27)
[INFO] [2023-10-13 09:19:42] Storing 25 Nodes (50/25/27)
[INFO] [2023-10-13 09:19:42] Handling diff: /app/public/data/schilthuizen_dav/diff/schilthuizen_dav_occurrences_30622.diff (7 lines)
[INFO] [2023-10-13 09:19:43] Loading occurrences diff file into memory (7 lines)...
[INFO] [2023-10-13 09:19:43] Storing 5 Occurrences (10/5/7)
[INFO] [2023-10-13 09:19:43] Storing 5 OccurrenceMetadata (10/5/7)
[INFO] [2023-10-13 09:19:43] Handling diff: /app/public/data/schilthuizen_dav/diff/schilthuizen_dav_measurements_30623.diff (36 lines)
[INFO] [2023-10-13 09:19:43] Loading measurements diff file into memory (36 lines)...
[INFO] [2023-10-13 09:19:43] Storing 4 TraitsReferences (43/34/36)
[INFO] [2023-10-13 09:19:43] Storing 34 Traits (43/34/36)
[INFO] [2023-10-13 09:19:43] Storing 5 MetaTraits (43/34/36)
[STOP] [2023-10-13 09:19:43] parse_diff_and_store
[START] [2023-10-13 09:19:43] resolve_keys
[2023-10-13 09:19:43] Resolving downloaded urls (this is not actually downloading them yet)
[INFO] [2023-10-13 09:19:51] Occurrences to nodes (through scientific_names)...
[INFO] [2023-10-13 09:19:51] traits to occurrences...
[INFO] [2023-10-13 09:19:51] traits to nodes (through occurrences)...
[INFO] [2023-10-13 09:19:51] Traits to sex term...
[INFO] [2023-10-13 09:19:51] Traits to lifestage term...
[INFO] [2023-10-13 09:19:51] MetaTraits to traits...
[INFO] [2023-10-13 09:19:51] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2023-10-13 09:19:51] Assocs to occurrences...
[INFO] [2023-10-13 09:19:51] Assocs to nodes...
[INFO] [2023-10-13 09:19:51] Assoc to sex term...
[INFO] [2023-10-13 09:19:51] Assoc to lifestage term...
[INFO] [2023-10-13 09:19:51] MetaAssoc to assocs...
[STOP] [2023-10-13 09:19:51] resolve_keys
[START] [2023-10-13 09:19:51] hold_for_later_1
[STOP] [2023-10-13 09:19:51] hold_for_later_1
[START] [2023-10-13 09:19:51] hold_for_later_2
[STOP] [2023-10-13 09:19:51] hold_for_later_2
[START] [2023-10-13 09:19:51] resolve_missing_parents
[STOP] [2023-10-13 09:19:51] resolve_missing_parents
[START] [2023-10-13 09:19:51] rebuild_nodes
[START] [2023-10-13 09:19:51] Flattener#flatten
[START] [2023-10-13 09:19:51] Flattener#study_resource
[START] [2023-10-13 09:19:51] Flattener#build_ancestry
[STOP] [2023-10-13 09:19:51] Flattener#build_ancestry
[INFO] [2023-10-13 09:19:51] 25 ancestry keys
[START] [2023-10-13 09:19:51] build_node_ancestors
[INFO] [2023-10-13 09:19:51] old ancestors deleted.
[STOP] [2023-10-13 09:19:51] build_node_ancestors
[WARN] [2023-10-13 09:19:51] Flattener: nothing to flatten! (Completely flat resource?)
[STOP] [2023-10-13 09:19:51] Flattener#flatten
[STOP] [2023-10-13 09:19:51] rebuild_nodes
[START] [2023-10-13 09:19:51] resolve_missing_media_owners
[STOP] [2023-10-13 09:19:51] resolve_missing_media_owners
[START] [2023-10-13 09:19:51] sanitize_media_verbatims
[STOP] [2023-10-13 09:19:51] sanitize_media_verbatims
[START] [2023-10-13 09:19:51] queue_downloads
[STOP] [2023-10-13 09:19:51] queue_downloads
[START] [2023-10-13 09:19:51] parse_names
[WARN] [2023-10-13 09:19:51] I see 25 names which still need to be parsed.
[WARN] [2023-10-13 09:19:51] Names to parse: 25 formatted: 25 learned: 25 parsed: 25
[STOP] [2023-10-13 09:19:52] parse_names
[START] [2023-10-13 09:19:52] denormalize_canonical_names_to_nodes
[STOP] [2023-10-13 09:19:52] denormalize_canonical_names_to_nodes
[START] [2023-10-13 09:19:52] match_nodes
[START] [2023-10-13 09:19:52] map_all_nodes_to_pages
[STOP] [2023-10-13 09:19:52] map_all_nodes_to_pages
[INFO] [2023-10-13 09:19:52] ZERO unmatched nodes (of 25)! Nicely done.
[START] [2023-10-13 09:19:52] update_nodes
[STOP] [2023-10-13 09:19:52] update_nodes
[STOP] [2023-10-13 09:19:52] match_nodes
[START] [2023-10-13 09:19:52] reindex_search
[STOP] [2023-10-13 09:19:52] reindex_search
[START] [2023-10-13 09:19:52] normalize_units
[STOP] [2023-10-13 09:19:52] normalize_units
[START] [2023-10-13 09:19:52] calculate_statistics
[2023-10-13 09:19:52] ZERO NODE ANCESTORS. Is this actually a completely flat resource?
[INFO] [2023-10-13 09:19:52] Duplicate page_id count: 0
[STOP] [2023-10-13 09:19:52] calculate_statistics
[START] [2023-10-13 09:19:52] complete_harvest_instance
[START] [2023-10-13 09:19:52] overall_tsv_creation
[INFO] [2023-10-13 09:19:52] Exporting 25 nodes as TSV in batches of 10000...
[INFO] [2023-10-13 09:19:52] Processing group of 25 in 1 batches of 10000
[INFO] [2023-10-13 09:19:52] 5 Traits (unfiltered) and 0 associations...
[INFO] [2023-10-13 09:19:52] Building Traits map for 25 nodes (this can take a while)...
[INFO] [2023-10-13 09:19:52] Mapped 5 traits (5 meta) for 25 nodes.
[INFO] [2023-10-13 09:19:52] Building Associations map (this can take a while)...
[INFO] [2023-10-13 09:19:52] Done. 0 assocs mapped (0 meta).
[INFO] [2023-10-13 09:19:52] Adding 5 traits...
[INFO] [2023-10-13 09:19:52] Trait #291092917 in key 291092917 has 28 metadata... that seems high?
[INFO] [2023-10-13 09:19:52] 33 metadata added.
[INFO] [2023-10-13 09:19:52] Adding 0 assocs...
[INFO] [2023-10-13 09:19:52] 0 metadata added.
[INFO] [2023-10-13 09:20:37] Processed 25/25 nodes
[INFO] [2023-10-13 09:20:37] Average Time: 44.85
[INFO] [2023-10-13 09:20:37] Total Time: 45s
[STOP] [2023-10-13 09:20:37] overall_tsv_creation
[INFO] [2023-10-13 09:20:37] Done. Check your files:
[INFO] [2023-10-13 09:20:37] (25 lines) /app/public/data/schilthuizen_dav/publish_nodes.tsv
[INFO] [2023-10-13 09:20:37] (25 lines) /app/public/data/schilthuizen_dav/publish_scientific_names.tsv
[INFO] [2023-10-13 09:20:37] (6 lines) /app/public/data/schilthuizen_dav/publish_traits.tsv
[INFO] [2023-10-13 09:20:37] (34 lines) /app/public/data/schilthuizen_dav/publish_metadata.tsv
[STOP] [2023-10-13 09:20:37] complete_harvest_instance
[START] [2023-10-13 09:20:37] completed
[STOP] [2023-10-13 09:20:37] completed
[STOP] [2023-10-13 09:20:37] logged process, took 56.67
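
Note on the calculate_delta step: the three [CMD] lines logged for each format (echo "0a", tail -n +1, echo ".") appear to wrap the whole converted CSV in a single ed-style append command, which is what a first harvest with no previous version to compare against would need. That would explain why each diff is exactly two lines longer than its CSV (4 -> 6, 25 -> 27, 5 -> 7, 34 -> 36). The sketch below reproduces that construction in Python. It is an illustration only, not the harvester's code; the function name write_initial_diff and the sample rows are hypothetical.

```python
import tempfile
from pathlib import Path


def write_initial_diff(csv_path: Path, diff_path: Path) -> int:
    """Wrap every line of csv_path in one ed-style append command.

    Mirrors the three logged shell commands:
      echo "0a"      -> append after line 0 (i.e. at the top of an empty file)
      tail -n +1 ... -> the full CSV, starting at its first line
      echo "."       -> terminator for the appended text
    Returns the number of lines written to diff_path.
    """
    rows = csv_path.read_text().splitlines()
    lines = ["0a", *rows, "."]
    diff_path.write_text("\n".join(lines) + "\n")
    return len(lines)


# Hypothetical stand-in for a 4-line references CSV (not the real data).
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "refs.csv"
    diff_path = Path(tmp) / "refs.diff"
    csv_path.write_text("ref1,a\nref2,b\nref3,c\nref4,d\n")
    diff_line_count = write_initial_diff(csv_path, diff_path)
    diff_text = diff_path.read_text()
```

With a 4-line input the function writes 6 lines, the same CSV-to-diff ratio the log reports for the refs file.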
