Harvest for Thomas J Walker's insect recordings, created 11 Jul 15:12

Stage: completed
Fetched: 11 Jul 15:12
Validated: 11 Jul 15:12
Deltas Created: 11 Jul 15:12
Units Normalized: 11 Jul 15:12
Ancestry Built: 11 Jul 15:12
Nodes Matched: 11 Jul 15:12
Names Parsed: 11 Jul 15:12
New Models Stored: 11 Jul 15:12
Indexed: 11 Jul 15:12
Completed: 11 Jul 15:12
Time to Harvest: less than a minute

Harvesting Log

(119 lines)
# Logfile created on 2019-07-11 15:12:03 -0400 by logger.rb/56815
[START] [2019-07-11 15:12:03] logged process
[START] [2019-07-11 15:12:03] create_harvest_instance
[STOP] [2019-07-11 15:12:03] create_harvest_instance
[START] [2019-07-11 15:12:03] fetch_files
[STOP] [2019-07-11 15:12:03] fetch_files
[START] [2019-07-11 15:12:03] validate_each_file
[STOP] [2019-07-11 15:12:04] validate_each_file
[START] [2019-07-11 15:12:04] convert_to_csv
[CMD] [2019-07-11 15:12:04] /usr/bin/sort /app/public/converted_csv/ttwsir_agents_14195.csv > /app/public/converted_csv/ttwsir_agents_14195.csv_sorted
[CMD] [2019-07-11 15:12:04] /usr/bin/sort /app/public/converted_csv/ttwsir_nodes_14196.csv > /app/public/converted_csv/ttwsir_nodes_14196.csv_sorted
[CMD] [2019-07-11 15:12:04] /usr/bin/sort /app/public/converted_csv/ttwsir_media_14197.csv > /app/public/converted_csv/ttwsir_media_14197.csv_sorted
[STOP] [2019-07-11 15:12:04] convert_to_csv
[START] [2019-07-11 15:12:04] calculate_delta
[CMD] [2019-07-11 15:12:04] echo "0a" > /app/public/diff/ttwsir_agents_14195.diff
[CMD] [2019-07-11 15:12:04] tail -n +1 /app/public/converted_csv/ttwsir_agents_14195.csv >> /app/public/diff/ttwsir_agents_14195.diff
[CMD] [2019-07-11 15:12:04] echo "." >> /app/public/diff/ttwsir_agents_14195.diff
[CMD] [2019-07-11 15:12:04] echo "0a" > /app/public/diff/ttwsir_nodes_14196.diff
[CMD] [2019-07-11 15:12:04] tail -n +1 /app/public/converted_csv/ttwsir_nodes_14196.csv >> /app/public/diff/ttwsir_nodes_14196.diff
[CMD] [2019-07-11 15:12:04] echo "." >> /app/public/diff/ttwsir_nodes_14196.diff
[CMD] [2019-07-11 15:12:04] echo "0a" > /app/public/diff/ttwsir_media_14197.diff
[CMD] [2019-07-11 15:12:04] tail -n +1 /app/public/converted_csv/ttwsir_media_14197.csv >> /app/public/diff/ttwsir_media_14197.diff
[CMD] [2019-07-11 15:12:05] echo "." >> /app/public/diff/ttwsir_media_14197.diff
[STOP] [2019-07-11 15:12:05] calculate_delta
[START] [2019-07-11 15:12:05] parse_diff_and_store
[INFO] [2019-07-11 15:12:05] Loading agents diff file into memory (true lines)...
[INFO] [2019-07-11 15:12:05] Loading nodes diff file into memory (true lines)...
[INFO] [2019-07-11 15:12:05] Loading media diff file into memory (true lines)...
[INFO] [2019-07-11 15:12:07] Storing 1 Attributions
[INFO] [2019-07-11 15:12:07] Processing group of 1 in 1 groups of 1000
[INFO] [2019-07-11 15:12:07] Average Time: 0.0
[INFO] [2019-07-11 15:12:07] Total Time: 1s
[INFO] [2019-07-11 15:12:07] Storing 253 ScientificNames
[INFO] [2019-07-11 15:12:07] Processing group of 253 in 1 groups of 1000
[INFO] [2019-07-11 15:12:07] Average Time: 0.29
[INFO] [2019-07-11 15:12:07] Total Time: 1s
[INFO] [2019-07-11 15:12:07] Storing 253 Nodes
[INFO] [2019-07-11 15:12:07] Processing group of 253 in 1 groups of 1000
[INFO] [2019-07-11 15:12:07] Average Time: 0.12
[INFO] [2019-07-11 15:12:07] Total Time: 1s
[INFO] [2019-07-11 15:12:07] Storing 1510 ContentAttributions
[INFO] [2019-07-11 15:12:07] Processing group of 1510 in 2 groups of 1000
[INFO] [2019-07-11 15:12:07] Average Time: 0.125
[INFO] [2019-07-11 15:12:07] Total Time: 1s
[INFO] [2019-07-11 15:12:07] Storing 1510 Media
[INFO] [2019-07-11 15:12:07] Processing group of 1510 in 2 groups of 1000
[INFO] [2019-07-11 15:12:08] Average Time: 0.355
[INFO] [2019-07-11 15:12:08] Total Time: 1s
[STOP] [2019-07-11 15:12:08] parse_diff_and_store
[START] [2019-07-11 15:12:08] resolve_keys
[INFO] [2019-07-11 15:12:12] Occurrences to nodes (through scientific_names)...
[INFO] [2019-07-11 15:12:12] traits to occurrences...
[INFO] [2019-07-11 15:12:12] traits to nodes (through occurrences)...
[INFO] [2019-07-11 15:12:12] Traits to sex term...
[INFO] [2019-07-11 15:12:12] Traits to lifestage term...
[INFO] [2019-07-11 15:12:12] MetaTraits to traits...
[INFO] [2019-07-11 15:12:12] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-07-11 15:12:12] Assocs to occurrences...
[INFO] [2019-07-11 15:12:12] Assocs to nodes...
[INFO] [2019-07-11 15:12:12] Assoc to sex term...
[INFO] [2019-07-11 15:12:12] Assoc to lifestage term...
[STOP] [2019-07-11 15:12:12] resolve_keys
[START] [2019-07-11 15:12:12] hold_for_later_1
[STOP] [2019-07-11 15:12:12] hold_for_later_1
[START] [2019-07-11 15:12:12] hold_for_later_2
[STOP] [2019-07-11 15:12:12] hold_for_later_2
[START] [2019-07-11 15:12:12] resolve_missing_parents
[STOP] [2019-07-11 15:12:12] resolve_missing_parents
[START] [2019-07-11 15:12:12] rebuild_nodes
[START] [2019-07-11 15:12:12] Flattener#flatten
[START] [2019-07-11 15:12:12] Flattener#study_resource
[START] [2019-07-11 15:12:12] Flattener#build_ancestry
[STOP] [2019-07-11 15:12:12] Flattener#build_ancestry
[INFO] [2019-07-11 15:12:12] 253 ancestry keys
[START] [2019-07-11 15:12:12] build_node_ancestors
[INFO] [2019-07-11 15:12:12] old ancestors deleted.
[STOP] [2019-07-11 15:12:12] build_node_ancestors
[WARN] [2019-07-11 15:12:12] Flattener: nothing to flatten! (Completely flat resource?)
[STOP] [2019-07-11 15:12:12] Flattener#flatten
[STOP] [2019-07-11 15:12:12] rebuild_nodes
[START] [2019-07-11 15:12:12] resolve_missing_media_owners
[STOP] [2019-07-11 15:12:12] resolve_missing_media_owners
[START] [2019-07-11 15:12:12] sanitize_media_verbatims
[STOP] [2019-07-11 15:12:12] sanitize_media_verbatims
[START] [2019-07-11 15:12:12] queue_downloads
[STOP] [2019-07-11 15:12:12] queue_downloads
[START] [2019-07-11 15:12:12] parse_names
[WARN] [2019-07-11 15:12:12] I see 253 names which still need to be parsed.
[STOP] [2019-07-11 15:12:13] parse_names
[START] [2019-07-11 15:12:13] denormalize_canonical_names_to_nodes
[STOP] [2019-07-11 15:12:13] denormalize_canonical_names_to_nodes
[START] [2019-07-11 15:12:13] match_nodes
[START] [2019-07-11 15:12:13] map_all_nodes_to_pages
[STOP] [2019-07-11 15:12:18] map_all_nodes_to_pages
[INFO] [2019-07-11 15:12:18] 18 Unmatched nodes (of 253)! That's too many to output. First 10: Caribophyllum caymenensis (#44160689); Hispanogryllus illotus (#44160731); Miogryllus saussurei oklahomae (#44160740); Romalea microptera (#44160745); Belocephalus subapterus uncinatus (#44160759); Neoconocephalus affinis (#44160766); Hispanogryllus ochleros (#44160768); Tibicen marginalis (#44160771); Orocharis volatus (#44160772); Orocharis canaster (#44160774)
[START] [2019-07-11 15:12:18] update_nodes
[STOP] [2019-07-11 15:12:18] update_nodes
[STOP] [2019-07-11 15:12:18] match_nodes
[START] [2019-07-11 15:12:18] reindex_search
[STOP] [2019-07-11 15:12:19] reindex_search
[START] [2019-07-11 15:12:19] normalize_units
[STOP] [2019-07-11 15:12:19] normalize_units
[START] [2019-07-11 15:12:19] calculate_statistics
[STOP] [2019-07-11 15:12:19] calculate_statistics
[START] [2019-07-11 15:12:19] complete_harvest_instance
[START] [2019-07-11 15:12:19] overall_tsv_creation
[INFO] [2019-07-11 15:12:19] Processing group of 253 in 1 batches of 10000
[INFO] [2019-07-11 15:12:58] Average Time: 16.71
[INFO] [2019-07-11 15:12:58] Total Time: 40s
[STOP] [2019-07-11 15:12:58] overall_tsv_creation
[INFO] [2019-07-11 15:12:58] Done. Check your files:
[INFO] [2019-07-11 15:12:58] (253 lines) /app/public/data/ttwsir/publish_nodes.tsv
[INFO] [2019-07-11 15:12:58] (253 lines) /app/public/data/ttwsir/publish_scientific_names.tsv
[INFO] [2019-07-11 15:12:58] (1510 lines) /app/public/data/ttwsir/publish_media.tsv
[INFO] [2019-07-11 15:12:59] (1510 lines) /app/public/data/ttwsir/publish_attributions.tsv
[STOP] [2019-07-11 15:12:59] complete_harvest_instance
[START] [2019-07-11 15:12:59] completed
[STOP] [2019-07-11 15:12:59] completed
[STOP] [2019-07-11 15:12:59] logged process, took 55.67
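
The START/STOP pairs in the log above can be used to see where the harvest spent its time. The following is a minimal sketch in Python, not part of the harvester itself: the function name `stage_durations` and the regular expression are assumptions made for illustration, based only on the line format visible in this log. Timestamps have one-second resolution, so the results are approximate.

```python
import re
from datetime import datetime

# Matches lines such as "[START] [2019-07-11 15:12:13] match_nodes".
# The pattern is inferred from the log above, not from the logger's source.
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)

def stage_durations(lines):
    """Return {stage name: elapsed seconds} for every START/STOP pair."""
    started = {}
    durations = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip [INFO], [WARN], [CMD] and header lines
        kind, stamp, name = m.groups()
        # The final STOP line carries a suffix (", took 55.67"); drop it.
        name = name.split(",")[0].strip()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            started[name] = when
        elif name in started:
            durations[name] = (when - started.pop(name)).total_seconds()
    return durations

# A few lines copied from the log above.
sample = [
    "[START] [2019-07-11 15:12:03] logged process",
    "[START] [2019-07-11 15:12:13] match_nodes",
    "[START] [2019-07-11 15:12:13] map_all_nodes_to_pages",
    "[STOP] [2019-07-11 15:12:18] map_all_nodes_to_pages",
    "[STOP] [2019-07-11 15:12:18] match_nodes",
    "[START] [2019-07-11 15:12:19] overall_tsv_creation",
    "[STOP] [2019-07-11 15:12:58] overall_tsv_creation",
    "[STOP] [2019-07-11 15:12:59] logged process, took 55.67",
]
durations = stage_durations(sample)
for name, seconds in sorted(durations.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {seconds:.0f}s")
```

On these sample lines, `overall_tsv_creation` accounts for 39 of the roughly 56 seconds between the first and last timestamps. A stage that logs START without a matching STOP (as `Flattener#study_resource` does above) is left out of the result.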
