Harvest for Hotchkiss 2000
Created: 25 Aug 16:49
Stage: completed
Fetched: 25 Aug 16:49
Validated: 25 Aug 16:49
Deltas Created: 25 Aug 16:49
Units Normalized: 25 Aug 16:49
Ancestry Built: 25 Aug 16:49
Nodes Matched: 25 Aug 16:49
Names Parsed: 25 Aug 16:49
New Models Stored: 25 Aug 16:49
Indexed: 25 Aug 16:49
Completed: 25 Aug 16:53
Time to Harvest: less than a minute
Harvesting Log (141 lines)
[INFO] [2022-08-25 16:49:01] Created harvest instance #4205
[STOP] [2022-08-25 16:49:01] create_harvest_instance
[START] [2022-08-25 16:49:01] fetch_files
[STOP] [2022-08-25 16:49:01] fetch_files
[START] [2022-08-25 16:49:01] validate_each_file
[INFO] [2022-08-25 16:49:01] Looping over 3 formats...
[INFO] [2022-08-25 16:49:01] ...nodes (/app/public/data/hotchkiss_hotchk/taxa.txt)
[INFO] [2022-08-25 16:49:01] Valid: /app/public/data/hotchkiss_hotchk/converted_csv/hotchkiss_hotchk_nodes_29677.csv (14 lines)
[INFO] [2022-08-25 16:49:01] ...occurrences (/app/public/data/hotchkiss_hotchk/occurrences.txt)
[INFO] [2022-08-25 16:49:01] Valid: /app/public/data/hotchkiss_hotchk/converted_csv/hotchkiss_hotchk_occurrences_29678.csv (14 lines)
[INFO] [2022-08-25 16:49:01] ...measurements (/app/public/data/hotchkiss_hotchk/measurementOrFact.txt)
[INFO] [2022-08-25 16:49:01] Valid: /app/public/data/hotchkiss_hotchk/converted_csv/hotchkiss_hotchk_measurements_29679.csv (27 lines)
[STOP] [2022-08-25 16:49:01] validate_each_file
[START] [2022-08-25 16:49:01] convert_to_csv
[INFO] [2022-08-25 16:49:01] Looping over 3 formats...
[INFO] [2022-08-25 16:49:01] ...nodes (/app/public/data/hotchkiss_hotchk/taxa.txt)
[CMD] [2022-08-25 16:49:01] /usr/bin/sort /app/public/data/hotchkiss_hotchk/converted_csv/hotchkiss_hotchk_nodes_29677.csv > /app/public/data/hotchkiss_hotchk/converted_csv/hotchkiss_hotchk_nodes_29677.csv_sorted
[INFO] [2022-08-25 16:49:01] Converted: /app/public/data/hotchkiss_hotchk/converted_csv/hotchkiss_hotchk_nodes_29677.csv (14 lines)
[INFO] [2022-08-25 16:49:01] ...occurrences (/app/public/data/hotchkiss_hotchk/occurrences.txt)
[CMD] [2022-08-25 16:49:01] /usr/bin/sort /app/public/data/hotchkiss_hotchk/converted_csv/hotchkiss_hotchk_occurrences_29678.csv > /app/public/data/hotchkiss_hotchk/converted_csv/hotchkiss_hotchk_occurrences_29678.csv_sorted
[INFO] [2022-08-25 16:49:01] Converted: /app/public/data/hotchkiss_hotchk/converted_csv/hotchkiss_hotchk_occurrences_29678.csv (14 lines)
[INFO] [2022-08-25 16:49:01] ...measurements (/app/public/data/hotchkiss_hotchk/measurementOrFact.txt)
[CMD] [2022-08-25 16:49:01] /usr/bin/sort /app/public/data/hotchkiss_hotchk/converted_csv/hotchkiss_hotchk_measurements_29679.csv > /app/public/data/hotchkiss_hotchk/converted_csv/hotchkiss_hotchk_measurements_29679.csv_sorted
[INFO] [2022-08-25 16:49:01] Converted: /app/public/data/hotchkiss_hotchk/converted_csv/hotchkiss_hotchk_measurements_29679.csv (27 lines)
[STOP] [2022-08-25 16:49:01] convert_to_csv
[START] [2022-08-25 16:49:01] calculate_delta
[INFO] [2022-08-25 16:49:01] Looping over 3 formats...
[INFO] [2022-08-25 16:49:01] ...nodes (/app/public/data/hotchkiss_hotchk/taxa.txt)
[CMD] [2022-08-25 16:49:01] echo "0a" > /app/public/data/hotchkiss_hotchk/diff/hotchkiss_hotchk_nodes_29677.diff
[CMD] [2022-08-25 16:49:01] tail -n +1 /app/public/data/hotchkiss_hotchk/converted_csv/hotchkiss_hotchk_nodes_29677.csv >> /app/public/data/hotchkiss_hotchk/diff/hotchkiss_hotchk_nodes_29677.diff
[CMD] [2022-08-25 16:49:01] echo "." >> /app/public/data/hotchkiss_hotchk/diff/hotchkiss_hotchk_nodes_29677.diff
[INFO] [2022-08-25 16:49:01] Created diff: /app/public/data/hotchkiss_hotchk/diff/hotchkiss_hotchk_nodes_29677.diff (16 lines)
[INFO] [2022-08-25 16:49:01] ...occurrences (/app/public/data/hotchkiss_hotchk/occurrences.txt)
[CMD] [2022-08-25 16:49:01] echo "0a" > /app/public/data/hotchkiss_hotchk/diff/hotchkiss_hotchk_occurrences_29678.diff
[CMD] [2022-08-25 16:49:01] tail -n +1 /app/public/data/hotchkiss_hotchk/converted_csv/hotchkiss_hotchk_occurrences_29678.csv >> /app/public/data/hotchkiss_hotchk/diff/hotchkiss_hotchk_occurrences_29678.diff
[CMD] [2022-08-25 16:49:01] echo "." >> /app/public/data/hotchkiss_hotchk/diff/hotchkiss_hotchk_occurrences_29678.diff
[INFO] [2022-08-25 16:49:01] Created diff: /app/public/data/hotchkiss_hotchk/diff/hotchkiss_hotchk_occurrences_29678.diff (16 lines)
[INFO] [2022-08-25 16:49:01] ...measurements (/app/public/data/hotchkiss_hotchk/measurementOrFact.txt)
[CMD] [2022-08-25 16:49:01] echo "0a" > /app/public/data/hotchkiss_hotchk/diff/hotchkiss_hotchk_measurements_29679.diff
[CMD] [2022-08-25 16:49:01] tail -n +1 /app/public/data/hotchkiss_hotchk/converted_csv/hotchkiss_hotchk_measurements_29679.csv >> /app/public/data/hotchkiss_hotchk/diff/hotchkiss_hotchk_measurements_29679.diff
[CMD] [2022-08-25 16:49:01] echo "." >> /app/public/data/hotchkiss_hotchk/diff/hotchkiss_hotchk_measurements_29679.diff
[INFO] [2022-08-25 16:49:01] Created diff: /app/public/data/hotchkiss_hotchk/diff/hotchkiss_hotchk_measurements_29679.diff (29 lines)
[STOP] [2022-08-25 16:49:01] calculate_delta
[START] [2022-08-25 16:49:01] parse_diff_and_store
[INFO] [2022-08-25 16:49:01] Handling diff: /app/public/data/hotchkiss_hotchk/diff/hotchkiss_hotchk_nodes_29677.diff (16 lines)
[INFO] [2022-08-25 16:49:01] Loading nodes diff file into memory (16 lines)...
[INFO] [2022-08-25 16:49:01] Storing 14 ScientificNames (28/14/16)
[INFO] [2022-08-25 16:49:01] Storing 14 Nodes (28/14/16)
[INFO] [2022-08-25 16:49:01] Handling diff: /app/public/data/hotchkiss_hotchk/diff/hotchkiss_hotchk_occurrences_29678.diff (16 lines)
[INFO] [2022-08-25 16:49:01] Loading occurrences diff file into memory (16 lines)...
[INFO] [2022-08-25 16:49:01] Storing 14 Occurrences (28/14/16)
[INFO] [2022-08-25 16:49:01] Storing 14 OccurrenceMetadata (28/14/16)
[INFO] [2022-08-25 16:49:01] Handling diff: /app/public/data/hotchkiss_hotchk/diff/hotchkiss_hotchk_measurements_29679.diff (29 lines)
[INFO] [2022-08-25 16:49:01] Loading measurements diff file into memory (29 lines)...
[INFO] [2022-08-25 16:49:01] Storing 27 Traits (41/27/29)
[INFO] [2022-08-25 16:49:01] Storing 14 MetaTraits (41/27/29)
[STOP] [2022-08-25 16:49:01] parse_diff_and_store
[START] [2022-08-25 16:49:01] resolve_keys
[2022-08-25 16:49:01] Resolving downloaded urls (this is not actually downloading them yet)
[INFO] [2022-08-25 16:49:09] Occurrences to nodes (through scientific_names)...
[INFO] [2022-08-25 16:49:09] traits to occurrences...
[INFO] [2022-08-25 16:49:09] traits to nodes (through occurrences)...
[INFO] [2022-08-25 16:49:09] Traits to sex term...
[INFO] [2022-08-25 16:49:09] Traits to lifestage term...
[INFO] [2022-08-25 16:49:09] MetaTraits to traits...
[INFO] [2022-08-25 16:49:09] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2022-08-25 16:49:09] Assocs to occurrences...
[INFO] [2022-08-25 16:49:09] Assocs to nodes...
[INFO] [2022-08-25 16:49:09] Assoc to sex term...
[INFO] [2022-08-25 16:49:09] Assoc to lifestage term...
[INFO] [2022-08-25 16:49:09] MetaAssoc to assocs...
[STOP] [2022-08-25 16:49:09] resolve_keys
[START] [2022-08-25 16:49:09] hold_for_later_1
[STOP] [2022-08-25 16:49:09] hold_for_later_1
[START] [2022-08-25 16:49:09] hold_for_later_2
[STOP] [2022-08-25 16:49:09] hold_for_later_2
[START] [2022-08-25 16:49:09] resolve_missing_parents
[STOP] [2022-08-25 16:49:09] resolve_missing_parents
[START] [2022-08-25 16:49:09] rebuild_nodes
[START] [2022-08-25 16:49:09] Flattener#flatten
[START] [2022-08-25 16:49:09] Flattener#study_resource
[START] [2022-08-25 16:49:09] Flattener#build_ancestry
[STOP] [2022-08-25 16:49:09] Flattener#build_ancestry
[INFO] [2022-08-25 16:49:09] 14 ancestry keys
[START] [2022-08-25 16:49:09] build_node_ancestors
[INFO] [2022-08-25 16:49:09] old ancestors deleted.
[STOP] [2022-08-25 16:49:09] build_node_ancestors
[WARN] [2022-08-25 16:49:09] Flattener: nothing to flatten! (Completely flat resource?)
[STOP] [2022-08-25 16:49:09] Flattener#flatten
[STOP] [2022-08-25 16:49:09] rebuild_nodes
[START] [2022-08-25 16:49:09] resolve_missing_media_owners
[STOP] [2022-08-25 16:49:09] resolve_missing_media_owners
[START] [2022-08-25 16:49:09] sanitize_media_verbatims
[STOP] [2022-08-25 16:49:09] sanitize_media_verbatims
[START] [2022-08-25 16:49:09] queue_downloads
[STOP] [2022-08-25 16:49:09] queue_downloads
[START] [2022-08-25 16:49:09] parse_names
[WARN] [2022-08-25 16:49:09] I see 14 names which still need to be parsed.
[WARN] [2022-08-25 16:49:09] Names to parse: 14 formatted: 14 learned: 14 parsed: 14
[STOP] [2022-08-25 16:49:10] parse_names
[START] [2022-08-25 16:49:10] denormalize_canonical_names_to_nodes
[STOP] [2022-08-25 16:49:10] denormalize_canonical_names_to_nodes
[START] [2022-08-25 16:49:10] match_nodes
[START] [2022-08-25 16:49:10] map_all_nodes_to_pages
[STOP] [2022-08-25 16:49:10] map_all_nodes_to_pages
[INFO] [2022-08-25 16:49:10] ZERO unmatched nodes (of 14)! Nicely done.
[START] [2022-08-25 16:49:10] update_nodes
[STOP] [2022-08-25 16:49:10] update_nodes
[STOP] [2022-08-25 16:49:10] match_nodes
[START] [2022-08-25 16:49:10] reindex_search
[STOP] [2022-08-25 16:49:12] reindex_search
[START] [2022-08-25 16:49:12] normalize_units
[STOP] [2022-08-25 16:49:12] normalize_units
[START] [2022-08-25 16:49:12] calculate_statistics
[2022-08-25 16:49:12] ZERO NODE ANCESTORS. Is this actually a completely flat resource?
[INFO] [2022-08-25 16:49:12] Duplicate page_id count: 0
[STOP] [2022-08-25 16:49:12] calculate_statistics
[START] [2022-08-25 16:49:12] complete_harvest_instance
[START] [2022-08-25 16:49:12] overall_tsv_creation
[INFO] [2022-08-25 16:49:12] Processing group of 14 in 1 batches of 10000
[INFO] [2022-08-25 16:51:47] 14 Traits (unfiltered)...
[INFO] [2022-08-25 16:51:47] Building Traits map (this can take a while)...
[INFO] [2022-08-25 16:52:51] Done. 14 traits mapped (14 meta).
[INFO] [2022-08-25 16:52:51] Building Associations map (this can take a while)...
[INFO] [2022-08-25 16:52:51] Done. 0 assocs mapped (0 meta).
[INFO] [2022-08-25 16:52:52] Adding 14 traits...
[INFO] [2022-08-25 16:52:52] 13 metadata added.
[INFO] [2022-08-25 16:52:52] Adding 0 assocs...
[INFO] [2022-08-25 16:52:52] 0 metadata added.
[INFO] [2022-08-25 16:53:42] Average Time: 143.42
[INFO] [2022-08-25 16:53:42] Total Time: 4m31s
[STOP] [2022-08-25 16:53:42] overall_tsv_creation
[INFO] [2022-08-25 16:53:42] Done. Check your files:
[INFO] [2022-08-25 16:53:42] (14 lines) /app/public/data/hotchkiss_hotchk/publish_nodes.tsv
[INFO] [2022-08-25 16:53:42] (14 lines) /app/public/data/hotchkiss_hotchk/publish_scientific_names.tsv
[INFO] [2022-08-25 16:53:42] (15 lines) /app/public/data/hotchkiss_hotchk/publish_traits.tsv
[INFO] [2022-08-25 16:53:42] (14 lines) /app/public/data/hotchkiss_hotchk/publish_metadata.tsv
[STOP] [2022-08-25 16:53:42] complete_harvest_instance
[START] [2022-08-25 16:53:42] completed
[STOP] [2022-08-25 16:53:43] completed
[STOP] [2022-08-25 16:53:43] logged process, took 281.67
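The [START]/[STOP] pairs in the log above can be used to see where the harvest time went. The following is a minimal sketch, not part of the harvester: the function name stage_durations and the regular expression are assumptions made for illustration, and the line format is inferred only from the lines shown in this log. STOP lines with no matching START (such as create_harvest_instance or the final "logged process" line) are ignored, as are INFO, WARN, CMD, and untagged lines.

```python
import re
from datetime import datetime

# Assumed line shape, inferred from this log: [START|STOP] [YYYY-MM-DD HH:MM:SS] name
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)


def stage_durations(lines):
    """Return {stage_name: seconds} for every START/STOP pair found."""
    started = {}    # stage name -> datetime of its START line
    durations = {}  # stage name -> elapsed seconds
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip INFO, WARN, CMD and untagged lines
        kind, stamp, name = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            started[name] = when
        elif name in started:
            # STOP with a matching START: record the elapsed time
            durations[name] = (when - started.pop(name)).total_seconds()
        # A STOP without a START is ignored.
    return durations


# Lines copied from the log above.
sample = [
    "[START] [2022-08-25 16:49:01] resolve_keys",
    "[INFO] [2022-08-25 16:49:09] Occurrences to nodes (through scientific_names)...",
    "[STOP] [2022-08-25 16:49:09] resolve_keys",
    "[START] [2022-08-25 16:49:12] overall_tsv_creation",
    "[STOP] [2022-08-25 16:53:42] overall_tsv_creation",
]

for name, seconds in stage_durations(sample).items():
    print(f"{name}: {seconds:.0f}s")
```

On the sample lines this reports 8 seconds for resolve_keys and 270 seconds for overall_tsv_creation. Timestamps in the log have one-second resolution, so stages that start and stop within the same second show as 0.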