Stage:
completed
Fetched:
19 Jan 14:02
Validated:
19 Jan 14:02
Deltas Created:
19 Jan 14:02
Units Normalized:
19 Jan 14:02
Ancestry Built:
19 Jan 14:02
Nodes Matched:
19 Jan 14:02
Names Parsed:
19 Jan 14:02
New Models Stored:
19 Jan 14:02
Indexed:
19 Jan 14:02
Completed:
19 Jan 14:03
Time to Harvest:
less than a minute
Harvesting Log
(142 lines)
[INFO] [2023-01-19 14:02:14] Created harvest instance #4256
[STOP] [2023-01-19 14:02:14] create_harvest_instance
[START] [2023-01-19 14:02:14] fetch_files
[STOP] [2023-01-19 14:02:14] fetch_files
[START] [2023-01-19 14:02:14] validate_each_file
[INFO] [2023-01-19 14:02:14] Looping over 3 formats...
[INFO] [2023-01-19 14:02:14] ...nodes (/app/public/data/ankel_simons_ank/taxa.txt)
[INFO] [2023-01-19 14:02:14] Valid: /app/public/data/ankel_simons_ank/converted_csv/ankel_simons_ank_nodes_30016.csv (17 lines)
[INFO] [2023-01-19 14:02:14] ...occurrences (/app/public/data/ankel_simons_ank/occurrences.txt)
[INFO] [2023-01-19 14:02:14] Valid: /app/public/data/ankel_simons_ank/converted_csv/ankel_simons_ank_occurrences_30017.csv (17 lines)
[INFO] [2023-01-19 14:02:14] ...measurements (/app/public/data/ankel_simons_ank/measurementsorfacts.txt)
[INFO] [2023-01-19 14:02:14] Valid: /app/public/data/ankel_simons_ank/converted_csv/ankel_simons_ank_measurements_30018.csv (34 lines)
[STOP] [2023-01-19 14:02:14] validate_each_file
[START] [2023-01-19 14:02:14] convert_to_csv
[INFO] [2023-01-19 14:02:14] Looping over 3 formats...
[INFO] [2023-01-19 14:02:14] ...nodes (/app/public/data/ankel_simons_ank/taxa.txt)
[CMD] [2023-01-19 14:02:14] /usr/bin/sort /app/public/data/ankel_simons_ank/converted_csv/ankel_simons_ank_nodes_30016.csv > /app/public/data/ankel_simons_ank/converted_csv/ankel_simons_ank_nodes_30016.csv_sorted
[INFO] [2023-01-19 14:02:14] Converted: /app/public/data/ankel_simons_ank/converted_csv/ankel_simons_ank_nodes_30016.csv (17 lines)
[INFO] [2023-01-19 14:02:14] ...occurrences (/app/public/data/ankel_simons_ank/occurrences.txt)
[CMD] [2023-01-19 14:02:14] /usr/bin/sort /app/public/data/ankel_simons_ank/converted_csv/ankel_simons_ank_occurrences_30017.csv > /app/public/data/ankel_simons_ank/converted_csv/ankel_simons_ank_occurrences_30017.csv_sorted
[INFO] [2023-01-19 14:02:14] Converted: /app/public/data/ankel_simons_ank/converted_csv/ankel_simons_ank_occurrences_30017.csv (17 lines)
[INFO] [2023-01-19 14:02:14] ...measurements (/app/public/data/ankel_simons_ank/measurementsorfacts.txt)
[CMD] [2023-01-19 14:02:14] /usr/bin/sort /app/public/data/ankel_simons_ank/converted_csv/ankel_simons_ank_measurements_30018.csv > /app/public/data/ankel_simons_ank/converted_csv/ankel_simons_ank_measurements_30018.csv_sorted
[INFO] [2023-01-19 14:02:14] Converted: /app/public/data/ankel_simons_ank/converted_csv/ankel_simons_ank_measurements_30018.csv (34 lines)
[STOP] [2023-01-19 14:02:14] convert_to_csv
[START] [2023-01-19 14:02:14] calculate_delta
[INFO] [2023-01-19 14:02:14] Looping over 3 formats...
[INFO] [2023-01-19 14:02:14] ...nodes (/app/public/data/ankel_simons_ank/taxa.txt)
[CMD] [2023-01-19 14:02:14] echo "0a" > /app/public/data/ankel_simons_ank/diff/ankel_simons_ank_nodes_30016.diff
[CMD] [2023-01-19 14:02:15] tail -n +1 /app/public/data/ankel_simons_ank/converted_csv/ankel_simons_ank_nodes_30016.csv >> /app/public/data/ankel_simons_ank/diff/ankel_simons_ank_nodes_30016.diff
[CMD] [2023-01-19 14:02:15] echo "." >> /app/public/data/ankel_simons_ank/diff/ankel_simons_ank_nodes_30016.diff
[INFO] [2023-01-19 14:02:15] Created diff: /app/public/data/ankel_simons_ank/diff/ankel_simons_ank_nodes_30016.diff (19 lines)
[INFO] [2023-01-19 14:02:15] ...occurrences (/app/public/data/ankel_simons_ank/occurrences.txt)
[CMD] [2023-01-19 14:02:15] echo "0a" > /app/public/data/ankel_simons_ank/diff/ankel_simons_ank_occurrences_30017.diff
[CMD] [2023-01-19 14:02:15] tail -n +1 /app/public/data/ankel_simons_ank/converted_csv/ankel_simons_ank_occurrences_30017.csv >> /app/public/data/ankel_simons_ank/diff/ankel_simons_ank_occurrences_30017.diff
[CMD] [2023-01-19 14:02:15] echo "." >> /app/public/data/ankel_simons_ank/diff/ankel_simons_ank_occurrences_30017.diff
[INFO] [2023-01-19 14:02:15] Created diff: /app/public/data/ankel_simons_ank/diff/ankel_simons_ank_occurrences_30017.diff (19 lines)
[INFO] [2023-01-19 14:02:15] ...measurements (/app/public/data/ankel_simons_ank/measurementsorfacts.txt)
[CMD] [2023-01-19 14:02:15] echo "0a" > /app/public/data/ankel_simons_ank/diff/ankel_simons_ank_measurements_30018.diff
[CMD] [2023-01-19 14:02:15] tail -n +1 /app/public/data/ankel_simons_ank/converted_csv/ankel_simons_ank_measurements_30018.csv >> /app/public/data/ankel_simons_ank/diff/ankel_simons_ank_measurements_30018.diff
[CMD] [2023-01-19 14:02:15] echo "." >> /app/public/data/ankel_simons_ank/diff/ankel_simons_ank_measurements_30018.diff
[INFO] [2023-01-19 14:02:15] Created diff: /app/public/data/ankel_simons_ank/diff/ankel_simons_ank_measurements_30018.diff (36 lines)
[STOP] [2023-01-19 14:02:15] calculate_delta
[START] [2023-01-19 14:02:15] parse_diff_and_store
[INFO] [2023-01-19 14:02:15] Handling diff: /app/public/data/ankel_simons_ank/diff/ankel_simons_ank_nodes_30016.diff (19 lines)
[INFO] [2023-01-19 14:02:15] Loading nodes diff file into memory (19 lines)...
[INFO] [2023-01-19 14:02:15] Storing 17 ScientificNames (34/17/19)
[INFO] [2023-01-19 14:02:15] Storing 17 Nodes (34/17/19)
[INFO] [2023-01-19 14:02:15] Handling diff: /app/public/data/ankel_simons_ank/diff/ankel_simons_ank_occurrences_30017.diff (19 lines)
[INFO] [2023-01-19 14:02:15] Loading occurrences diff file into memory (19 lines)...
[INFO] [2023-01-19 14:02:15] Storing 17 Occurrences (17/17/19)
[INFO] [2023-01-19 14:02:15] Handling diff: /app/public/data/ankel_simons_ank/diff/ankel_simons_ank_measurements_30018.diff (36 lines)
[INFO] [2023-01-19 14:02:15] Loading measurements diff file into memory (36 lines)...
[INFO] [2023-01-19 14:02:15] Storing 34 Traits (51/34/36)
[INFO] [2023-01-19 14:02:15] Storing 17 MetaTraits (51/34/36)
[STOP] [2023-01-19 14:02:15] parse_diff_and_store
[START] [2023-01-19 14:02:15] resolve_keys
[2023-01-19 14:02:15] Resolving downloaded urls (this is not actually downloading them yet)
[INFO] [2023-01-19 14:02:22] Occurrences to nodes (through scientific_names)...
[INFO] [2023-01-19 14:02:22] traits to occurrences...
[INFO] [2023-01-19 14:02:22] traits to nodes (through occurrences)...
[INFO] [2023-01-19 14:02:22] Traits to sex term...
[INFO] [2023-01-19 14:02:22] Traits to lifestage term...
[INFO] [2023-01-19 14:02:22] MetaTraits to traits...
[INFO] [2023-01-19 14:02:22] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2023-01-19 14:02:22] Assocs to occurrences...
[INFO] [2023-01-19 14:02:22] Assocs to nodes...
[INFO] [2023-01-19 14:02:22] Assoc to sex term...
[INFO] [2023-01-19 14:02:22] Assoc to lifestage term...
[INFO] [2023-01-19 14:02:22] MetaAssoc to assocs...
[STOP] [2023-01-19 14:02:22] resolve_keys
[START] [2023-01-19 14:02:22] hold_for_later_1
[STOP] [2023-01-19 14:02:22] hold_for_later_1
[START] [2023-01-19 14:02:22] hold_for_later_2
[STOP] [2023-01-19 14:02:22] hold_for_later_2
[START] [2023-01-19 14:02:22] resolve_missing_parents
[STOP] [2023-01-19 14:02:22] resolve_missing_parents
[START] [2023-01-19 14:02:22] rebuild_nodes
[START] [2023-01-19 14:02:22] Flattener#flatten
[START] [2023-01-19 14:02:22] Flattener#study_resource
[START] [2023-01-19 14:02:22] Flattener#build_ancestry
[STOP] [2023-01-19 14:02:22] Flattener#build_ancestry
[INFO] [2023-01-19 14:02:22] 17 ancestry keys
[START] [2023-01-19 14:02:22] build_node_ancestors
[INFO] [2023-01-19 14:02:22] old ancestors deleted.
[STOP] [2023-01-19 14:02:22] build_node_ancestors
[WARN] [2023-01-19 14:02:22] Flattener: nothing to flatten! (Completely flat resource?)
[STOP] [2023-01-19 14:02:22] Flattener#flatten
[STOP] [2023-01-19 14:02:22] rebuild_nodes
[START] [2023-01-19 14:02:22] resolve_missing_media_owners
[STOP] [2023-01-19 14:02:22] resolve_missing_media_owners
[START] [2023-01-19 14:02:22] sanitize_media_verbatims
[STOP] [2023-01-19 14:02:22] sanitize_media_verbatims
[START] [2023-01-19 14:02:22] queue_downloads
[STOP] [2023-01-19 14:02:22] queue_downloads
[START] [2023-01-19 14:02:22] parse_names
[WARN] [2023-01-19 14:02:22] I see 17 names which still need to be parsed.
[WARN] [2023-01-19 14:02:23] Names to parse: 17 formatted: 17 learned: 17 parsed: 17
[STOP] [2023-01-19 14:02:24] parse_names
[START] [2023-01-19 14:02:24] denormalize_canonical_names_to_nodes
[STOP] [2023-01-19 14:02:24] denormalize_canonical_names_to_nodes
[START] [2023-01-19 14:02:24] match_nodes
[START] [2023-01-19 14:02:24] map_all_nodes_to_pages
[STOP] [2023-01-19 14:02:24] map_all_nodes_to_pages
[INFO] [2023-01-19 14:02:24] ZERO unmatched nodes (of 17)! Nicely done.
[START] [2023-01-19 14:02:24] update_nodes
[STOP] [2023-01-19 14:02:24] update_nodes
[STOP] [2023-01-19 14:02:24] match_nodes
[START] [2023-01-19 14:02:24] reindex_search
[STOP] [2023-01-19 14:02:24] reindex_search
[START] [2023-01-19 14:02:24] normalize_units
[STOP] [2023-01-19 14:02:24] normalize_units
[START] [2023-01-19 14:02:24] calculate_statistics
[2023-01-19 14:02:24] ZERO NODE ANCESTORS. Is this actually a completely flat resource?
[INFO] [2023-01-19 14:02:24] Duplicate page_id count: 0
[STOP] [2023-01-19 14:02:24] calculate_statistics
[START] [2023-01-19 14:02:24] complete_harvest_instance
[START] [2023-01-19 14:02:24] overall_tsv_creation
[INFO] [2023-01-19 14:02:24] Exporting 17 nodes as TSV in batches of 10000...
[INFO] [2023-01-19 14:02:24] Processing group of 17 in 1 batches of 10000
[INFO] [2023-01-19 14:02:24] 17 Traits (unfiltered) and 0 associations...
[INFO] [2023-01-19 14:02:24] Building Traits map for 17 nodes (this can take a while)...
[INFO] [2023-01-19 14:02:24] Mapped 17 traits (17 meta) for 17 nodes.
[INFO] [2023-01-19 14:02:24] Building Associations map (this can take a while)...
[INFO] [2023-01-19 14:02:24] Done. 0 assocs mapped (0 meta).
[INFO] [2023-01-19 14:02:24] Adding 17 traits...
[INFO] [2023-01-19 14:02:24] 17 metadata added.
[INFO] [2023-01-19 14:02:24] Adding 0 assocs...
[INFO] [2023-01-19 14:02:24] 0 metadata added.
[INFO] [2023-01-19 14:03:08] Processed 17/17 nodes
[INFO] [2023-01-19 14:03:08] Average Time: 43.76
[INFO] [2023-01-19 14:03:08] Total Time: 44s
[STOP] [2023-01-19 14:03:08] overall_tsv_creation
[INFO] [2023-01-19 14:03:08] Done. Check your files:
[INFO] [2023-01-19 14:03:08] (17 lines) /app/public/data/ankel_simons_ank/publish_nodes.tsv
[INFO] [2023-01-19 14:03:08] (17 lines) /app/public/data/ankel_simons_ank/publish_scientific_names.tsv
[INFO] [2023-01-19 14:03:08] (18 lines) /app/public/data/ankel_simons_ank/publish_traits.tsv
[INFO] [2023-01-19 14:03:08] (18 lines) /app/public/data/ankel_simons_ank/publish_metadata.tsv
[STOP] [2023-01-19 14:03:08] complete_harvest_instance
[START] [2023-01-19 14:03:08] completed
[STOP] [2023-01-19 14:03:08] completed
[STOP] [2023-01-19 14:03:08] logged process, took 53.81