Harvest for Bitsch 2005
Created: 13 Oct 13:43
Stage: completed
Fetched: 13 Oct 13:43
Validated: 13 Oct 13:43
Deltas Created: 13 Oct 13:43
Units Normalized: 13 Oct 13:43
Ancestry Built: 13 Oct 13:43
Nodes Matched: 13 Oct 13:43
Names Parsed: 13 Oct 13:43
New Models Stored: 13 Oct 13:43
Indexed: 13 Oct 13:43
Completed: 13 Oct 13:44
Time to Harvest: less than a minute
Harvesting Log (161 lines)
[INFO] [2023-10-13 13:43:48] Created harvest instance #4453
[STOP] [2023-10-13 13:43:48] create_harvest_instance
[START] [2023-10-13 13:43:48] fetch_files
[STOP] [2023-10-13 13:43:48] fetch_files
[START] [2023-10-13 13:43:48] validate_each_file
[INFO] [2023-10-13 13:43:48] Looping over 4 formats...
[INFO] [2023-10-13 13:43:48] ...refs (/app/public/data/bitsch_bitsch_20/references.tsv)
[INFO] [2023-10-13 13:43:48] Valid: /app/public/data/bitsch_bitsch_20/converted_csv/bitsch_bitsch_20_refs_30819.csv (2 lines)
[INFO] [2023-10-13 13:43:48] ...nodes (/app/public/data/bitsch_bitsch_20/taxa.txt)
[INFO] [2023-10-13 13:43:48] Valid: /app/public/data/bitsch_bitsch_20/converted_csv/bitsch_bitsch_20_nodes_30816.csv (31 lines)
[INFO] [2023-10-13 13:43:48] ...occurrences (/app/public/data/bitsch_bitsch_20/occurrences.txt)
[INFO] [2023-10-13 13:43:48] Valid: /app/public/data/bitsch_bitsch_20/converted_csv/bitsch_bitsch_20_occurrences_30817.csv (34 lines)
[INFO] [2023-10-13 13:43:48] ...measurements (/app/public/data/bitsch_bitsch_20/measurementsorfacts.txt)
[INFO] [2023-10-13 13:43:48] Valid: /app/public/data/bitsch_bitsch_20/converted_csv/bitsch_bitsch_20_measurements_30818.csv (236 lines)
[STOP] [2023-10-13 13:43:48] validate_each_file
[START] [2023-10-13 13:43:48] convert_to_csv
[INFO] [2023-10-13 13:43:48] Looping over 4 formats...
[INFO] [2023-10-13 13:43:48] ...refs (/app/public/data/bitsch_bitsch_20/references.tsv)
[CMD] [2023-10-13 13:43:48] /usr/bin/sort /app/public/data/bitsch_bitsch_20/converted_csv/bitsch_bitsch_20_refs_30819.csv > /app/public/data/bitsch_bitsch_20/converted_csv/bitsch_bitsch_20_refs_30819.csv_sorted
[INFO] [2023-10-13 13:43:48] Converted: /app/public/data/bitsch_bitsch_20/converted_csv/bitsch_bitsch_20_refs_30819.csv (2 lines)
[INFO] [2023-10-13 13:43:48] ...nodes (/app/public/data/bitsch_bitsch_20/taxa.txt)
[CMD] [2023-10-13 13:43:48] /usr/bin/sort /app/public/data/bitsch_bitsch_20/converted_csv/bitsch_bitsch_20_nodes_30816.csv > /app/public/data/bitsch_bitsch_20/converted_csv/bitsch_bitsch_20_nodes_30816.csv_sorted
[INFO] [2023-10-13 13:43:48] Converted: /app/public/data/bitsch_bitsch_20/converted_csv/bitsch_bitsch_20_nodes_30816.csv (31 lines)
[INFO] [2023-10-13 13:43:48] ...occurrences (/app/public/data/bitsch_bitsch_20/occurrences.txt)
[CMD] [2023-10-13 13:43:48] /usr/bin/sort /app/public/data/bitsch_bitsch_20/converted_csv/bitsch_bitsch_20_occurrences_30817.csv > /app/public/data/bitsch_bitsch_20/converted_csv/bitsch_bitsch_20_occurrences_30817.csv_sorted
[INFO] [2023-10-13 13:43:48] Converted: /app/public/data/bitsch_bitsch_20/converted_csv/bitsch_bitsch_20_occurrences_30817.csv (34 lines)
[INFO] [2023-10-13 13:43:48] ...measurements (/app/public/data/bitsch_bitsch_20/measurementsorfacts.txt)
[CMD] [2023-10-13 13:43:48] /usr/bin/sort /app/public/data/bitsch_bitsch_20/converted_csv/bitsch_bitsch_20_measurements_30818.csv > /app/public/data/bitsch_bitsch_20/converted_csv/bitsch_bitsch_20_measurements_30818.csv_sorted
[INFO] [2023-10-13 13:43:48] Converted: /app/public/data/bitsch_bitsch_20/converted_csv/bitsch_bitsch_20_measurements_30818.csv (236 lines)
[STOP] [2023-10-13 13:43:48] convert_to_csv
[START] [2023-10-13 13:43:48] calculate_delta
[INFO] [2023-10-13 13:43:48] Looping over 4 formats...
[INFO] [2023-10-13 13:43:48] ...refs (/app/public/data/bitsch_bitsch_20/references.tsv)
[CMD] [2023-10-13 13:43:48] echo "0a" > /app/public/data/bitsch_bitsch_20/diff/bitsch_bitsch_20_refs_30819.diff
[CMD] [2023-10-13 13:43:48] tail -n +1 /app/public/data/bitsch_bitsch_20/converted_csv/bitsch_bitsch_20_refs_30819.csv >> /app/public/data/bitsch_bitsch_20/diff/bitsch_bitsch_20_refs_30819.diff
[CMD] [2023-10-13 13:43:48] echo "." >> /app/public/data/bitsch_bitsch_20/diff/bitsch_bitsch_20_refs_30819.diff
[INFO] [2023-10-13 13:43:48] Created diff: /app/public/data/bitsch_bitsch_20/diff/bitsch_bitsch_20_refs_30819.diff (4 lines)
[INFO] [2023-10-13 13:43:48] ...nodes (/app/public/data/bitsch_bitsch_20/taxa.txt)
[CMD] [2023-10-13 13:43:48] echo "0a" > /app/public/data/bitsch_bitsch_20/diff/bitsch_bitsch_20_nodes_30816.diff
[CMD] [2023-10-13 13:43:48] tail -n +1 /app/public/data/bitsch_bitsch_20/converted_csv/bitsch_bitsch_20_nodes_30816.csv >> /app/public/data/bitsch_bitsch_20/diff/bitsch_bitsch_20_nodes_30816.diff
[CMD] [2023-10-13 13:43:48] echo "." >> /app/public/data/bitsch_bitsch_20/diff/bitsch_bitsch_20_nodes_30816.diff
[INFO] [2023-10-13 13:43:48] Created diff: /app/public/data/bitsch_bitsch_20/diff/bitsch_bitsch_20_nodes_30816.diff (33 lines)
[INFO] [2023-10-13 13:43:48] ...occurrences (/app/public/data/bitsch_bitsch_20/occurrences.txt)
[CMD] [2023-10-13 13:43:48] echo "0a" > /app/public/data/bitsch_bitsch_20/diff/bitsch_bitsch_20_occurrences_30817.diff
[CMD] [2023-10-13 13:43:49] tail -n +1 /app/public/data/bitsch_bitsch_20/converted_csv/bitsch_bitsch_20_occurrences_30817.csv >> /app/public/data/bitsch_bitsch_20/diff/bitsch_bitsch_20_occurrences_30817.diff
[CMD] [2023-10-13 13:43:49] echo "." >> /app/public/data/bitsch_bitsch_20/diff/bitsch_bitsch_20_occurrences_30817.diff
[INFO] [2023-10-13 13:43:49] Created diff: /app/public/data/bitsch_bitsch_20/diff/bitsch_bitsch_20_occurrences_30817.diff (36 lines)
[INFO] [2023-10-13 13:43:49] ...measurements (/app/public/data/bitsch_bitsch_20/measurementsorfacts.txt)
[CMD] [2023-10-13 13:43:49] echo "0a" > /app/public/data/bitsch_bitsch_20/diff/bitsch_bitsch_20_measurements_30818.diff
[CMD] [2023-10-13 13:43:49] tail -n +1 /app/public/data/bitsch_bitsch_20/converted_csv/bitsch_bitsch_20_measurements_30818.csv >> /app/public/data/bitsch_bitsch_20/diff/bitsch_bitsch_20_measurements_30818.diff
[CMD] [2023-10-13 13:43:49] echo "." >> /app/public/data/bitsch_bitsch_20/diff/bitsch_bitsch_20_measurements_30818.diff
[INFO] [2023-10-13 13:43:49] Created diff: /app/public/data/bitsch_bitsch_20/diff/bitsch_bitsch_20_measurements_30818.diff (238 lines)
[STOP] [2023-10-13 13:43:49] calculate_delta
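
The calculate_delta commands above write "0a", then the full converted CSV, then "." into each diff file. That is the shape of an ed-style script that appends everything after line 0, which would explain why each diff is exactly two lines longer than its CSV (2 -> 4, 31 -> 33, 34 -> 36, 236 -> 238). The sketch below reproduces that shape in Python. It is an illustration only: the function name `build_initial_diff` and the sample rows are hypothetical, and the reading that this happens when there is no earlier version to compare against is an assumption, not something the log states.

```python
def build_initial_diff(csv_lines):
    """Return an ed-style script that appends every CSV line after line 0.

    Hypothetical helper: it mirrors the logged shell commands
    (echo "0a", tail -n +1 <csv>, echo "."), not the harvester's own code.
    """
    # "0a" tells ed to append after line 0, and a lone "." ends the appended text.
    return ["0a", *csv_lines, "."]


# Made-up rows standing in for a two-line refs CSV.
rows = ["ref-1,Example citation A", "ref-2,Example citation B"]
script = build_initial_diff(rows)
print(len(script))  # 4, matching the "(4 lines)" reported for the refs diff
```

The two framing lines are the only overhead, so the diff length is always the CSV length plus two.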
[START] [2023-10-13 13:43:49] parse_diff_and_store
[INFO] [2023-10-13 13:43:49] Handling diff: /app/public/data/bitsch_bitsch_20/diff/bitsch_bitsch_20_refs_30819.diff (4 lines)
[INFO] [2023-10-13 13:43:49] Loading refs diff file into memory (4 lines)...
[INFO] [2023-10-13 13:43:49] Storing 2 References (2/2/4)
[INFO] [2023-10-13 13:43:49] Handling diff: /app/public/data/bitsch_bitsch_20/diff/bitsch_bitsch_20_nodes_30816.diff (33 lines)
[INFO] [2023-10-13 13:43:49] Loading nodes diff file into memory (33 lines)...
[INFO] [2023-10-13 13:43:49] Storing 31 ScientificNames (62/31/33)
[INFO] [2023-10-13 13:43:49] Storing 31 Nodes (62/31/33)
[INFO] [2023-10-13 13:43:49] Handling diff: /app/public/data/bitsch_bitsch_20/diff/bitsch_bitsch_20_occurrences_30817.diff (36 lines)
[INFO] [2023-10-13 13:43:49] Loading occurrences diff file into memory (36 lines)...
[INFO] [2023-10-13 13:43:49] Storing 34 Occurrences (48/34/36)
[INFO] [2023-10-13 13:43:49] Storing 14 OccurrenceMetadata (48/34/36)
[INFO] [2023-10-13 13:43:49] Handling diff: /app/public/data/bitsch_bitsch_20/diff/bitsch_bitsch_20_measurements_30818.diff (238 lines)
[INFO] [2023-10-13 13:43:49] Loading measurements diff file into memory (238 lines)...
[INFO] [2023-10-13 13:43:49] Storing 236 Traits (276/236/238)
[INFO] [2023-10-13 13:43:50] Storing 38 MetaTraits (276/236/238)
[INFO] [2023-10-13 13:43:50] Storing 2 TraitsReferences (276/236/238)
[STOP] [2023-10-13 13:43:50] parse_diff_and_store
[START] [2023-10-13 13:43:50] resolve_keys
[2023-10-13 13:43:50] Resolving downloaded urls (this is not actually downloading them yet)
[INFO] [2023-10-13 13:43:57] Occurrences to nodes (through scientific_names)...
[INFO] [2023-10-13 13:43:57] traits to occurrences...
[INFO] [2023-10-13 13:43:57] traits to nodes (through occurrences)...
[INFO] [2023-10-13 13:43:57] Traits to sex term...
[INFO] [2023-10-13 13:43:57] Traits to lifestage term...
[INFO] [2023-10-13 13:43:57] MetaTraits to traits...
[INFO] [2023-10-13 13:43:57] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2023-10-13 13:43:57] Assocs to occurrences...
[INFO] [2023-10-13 13:43:57] Assocs to nodes...
[INFO] [2023-10-13 13:43:57] Assoc to sex term...
[INFO] [2023-10-13 13:43:57] Assoc to lifestage term...
[INFO] [2023-10-13 13:43:57] MetaAssoc to assocs...
[STOP] [2023-10-13 13:43:58] resolve_keys
[START] [2023-10-13 13:43:58] hold_for_later_1
[STOP] [2023-10-13 13:43:58] hold_for_later_1
[START] [2023-10-13 13:43:58] hold_for_later_2
[STOP] [2023-10-13 13:43:58] hold_for_later_2
[START] [2023-10-13 13:43:58] resolve_missing_parents
[STOP] [2023-10-13 13:43:58] resolve_missing_parents
[START] [2023-10-13 13:43:58] rebuild_nodes
[START] [2023-10-13 13:43:58] Flattener#flatten
[START] [2023-10-13 13:43:58] Flattener#study_resource
[START] [2023-10-13 13:43:58] Flattener#build_ancestry
[STOP] [2023-10-13 13:43:58] Flattener#build_ancestry
[INFO] [2023-10-13 13:43:58] 31 ancestry keys
[START] [2023-10-13 13:43:58] build_node_ancestors
[INFO] [2023-10-13 13:43:58] old ancestors deleted.
[STOP] [2023-10-13 13:43:58] build_node_ancestors
[WARN] [2023-10-13 13:43:58] Flattener: nothing to flatten! (Completely flat resource?)
[STOP] [2023-10-13 13:43:58] Flattener#flatten
[STOP] [2023-10-13 13:43:58] rebuild_nodes
[START] [2023-10-13 13:43:58] resolve_missing_media_owners
[STOP] [2023-10-13 13:43:58] resolve_missing_media_owners
[START] [2023-10-13 13:43:58] sanitize_media_verbatims
[STOP] [2023-10-13 13:43:58] sanitize_media_verbatims
[START] [2023-10-13 13:43:58] queue_downloads
[STOP] [2023-10-13 13:43:58] queue_downloads
[START] [2023-10-13 13:43:58] parse_names
[WARN] [2023-10-13 13:43:58] I see 31 names which still need to be parsed.
[WARN] [2023-10-13 13:43:58] Names to parse: 31 formatted: 31 learned: 31 parsed: 31
[STOP] [2023-10-13 13:43:59] parse_names
[START] [2023-10-13 13:43:59] denormalize_canonical_names_to_nodes
[STOP] [2023-10-13 13:43:59] denormalize_canonical_names_to_nodes
[START] [2023-10-13 13:43:59] match_nodes
[START] [2023-10-13 13:43:59] map_all_nodes_to_pages
[STOP] [2023-10-13 13:43:59] map_all_nodes_to_pages
[INFO] [2023-10-13 13:43:59] ZERO unmatched nodes (of 31)! Nicely done.
[START] [2023-10-13 13:43:59] update_nodes
[STOP] [2023-10-13 13:43:59] update_nodes
[STOP] [2023-10-13 13:43:59] match_nodes
[START] [2023-10-13 13:43:59] reindex_search
[STOP] [2023-10-13 13:43:59] reindex_search
[START] [2023-10-13 13:43:59] normalize_units
[STOP] [2023-10-13 13:43:59] normalize_units
[START] [2023-10-13 13:43:59] calculate_statistics
[2023-10-13 13:43:59] ZERO NODE ANCESTORS. Is this actually a completely flat resource?
[INFO] [2023-10-13 13:43:59] Duplicate page_id count: 0
[STOP] [2023-10-13 13:43:59] calculate_statistics
[START] [2023-10-13 13:43:59] complete_harvest_instance
[START] [2023-10-13 13:43:59] overall_tsv_creation
[INFO] [2023-10-13 13:43:59] Exporting 31 nodes as TSV in batches of 10000...
[INFO] [2023-10-13 13:43:59] Processing group of 31 in 1 batches of 10000
[INFO] [2023-10-13 13:43:59] 29 Traits (unfiltered) and 0 associations...
[INFO] [2023-10-13 13:43:59] Building Traits map for 31 nodes (this can take a while)...
[INFO] [2023-10-13 13:43:59] Mapped 29 traits (31 meta) for 31 nodes.
[INFO] [2023-10-13 13:43:59] Building Associations map (this can take a while)...
[INFO] [2023-10-13 13:43:59] Done. 0 assocs mapped (0 meta).
[INFO] [2023-10-13 13:43:59] Adding 29 traits...
[INFO] [2023-10-13 13:43:59] Trait #291664731 in key 291664731 has 24 metadata... that seems high?
[INFO] [2023-10-13 13:43:59] Trait #291664831 in key 291664831 has 29 metadata... that seems high?
[INFO] [2023-10-13 13:43:59] Trait #291664901 in key 291664901 has 51 metadata... that seems high?
[INFO] [2023-10-13 13:43:59] Trait #291664964 in key 291664964 has 26 metadata... that seems high?
[INFO] [2023-10-13 13:43:59] 191 metadata added.
[INFO] [2023-10-13 13:43:59] Adding 0 assocs...
[INFO] [2023-10-13 13:43:59] 0 metadata added.
[INFO] [2023-10-13 13:44:43] Processed 31/31 nodes
[INFO] [2023-10-13 13:44:43] Average Time: 44.12
[INFO] [2023-10-13 13:44:43] Total Time: 45s
[STOP] [2023-10-13 13:44:43] overall_tsv_creation
[INFO] [2023-10-13 13:44:43] Done. Check your files:
[INFO] [2023-10-13 13:44:44] (31 lines) /app/public/data/bitsch_bitsch_20/publish_nodes.tsv
[INFO] [2023-10-13 13:44:44] (31 lines) /app/public/data/bitsch_bitsch_20/publish_scientific_names.tsv
[INFO] [2023-10-13 13:44:44] (30 lines) /app/public/data/bitsch_bitsch_20/publish_traits.tsv
[INFO] [2023-10-13 13:44:44] (192 lines) /app/public/data/bitsch_bitsch_20/publish_metadata.tsv
[STOP] [2023-10-13 13:44:44] complete_harvest_instance
[START] [2023-10-13 13:44:44] completed
[STOP] [2023-10-13 13:44:44] completed
[STOP] [2023-10-13 13:44:44] logged process, took 56.27
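
Most of the 56 seconds reported above falls between the START and STOP of overall_tsv_creation. A log in this format can be checked for slow stages by pairing each [START] line with the [STOP] line of the same name. The following is a minimal sketch under stated assumptions: the function name `stage_durations` and the regular expression are hypothetical, it relies only on the line layout visible in this log, and it skips STOP lines that have no matching START (such as create_harvest_instance).

```python
import re
from datetime import datetime

# Matches lines such as "[START] [2023-10-13 13:43:59] overall_tsv_creation".
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)


def stage_durations(lines):
    """Map each stage name to the seconds between its START and STOP lines."""
    started = {}
    durations = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # INFO, WARN, CMD and untagged lines are ignored
        kind, stamp, name = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            started[name] = when
        elif name in started:
            durations[name] = (when - started.pop(name)).total_seconds()
    return durations


sample = [
    "[START] [2023-10-13 13:43:59] overall_tsv_creation",
    "[INFO] [2023-10-13 13:44:43] Processed 31/31 nodes",
    "[STOP] [2023-10-13 13:44:43] overall_tsv_creation",
]
print(stage_durations(sample))  # {'overall_tsv_creation': 44.0}
```

Timestamps in the log have one-second resolution, so stages that start and stop within the same second report 0.0.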
Latest Process