Stage: completed
Fetched: 15 Oct 18:17
Validated: 15 Oct 18:17
Deltas Created: 15 Oct 18:17
Units Normalized: 15 Oct 18:23
Ancestry Built: 15 Oct 18:18
Nodes Matched: 15 Oct 18:23
Names Parsed: 15 Oct 18:18
New Models Stored: 15 Oct 18:17
Indexed: 15 Oct 18:23
Completed: 15 Oct 18:26
Time to Harvest: less than a minute
Harvesting Log (139 lines)
# Logfile created on 2019-10-15 18:17:29 -0400 by logger.rb/56815
[START] [2019-10-15 18:17:29] logged process
[START] [2019-10-15 18:17:29] create_harvest_instance
[STOP] [2019-10-15 18:17:30] create_harvest_instance
[START] [2019-10-15 18:17:30] fetch_files
[STOP] [2019-10-15 18:17:30] fetch_files
[START] [2019-10-15 18:17:30] validate_each_file
[STOP] [2019-10-15 18:17:31] validate_each_file
[START] [2019-10-15 18:17:31] convert_to_csv
[CMD] [2019-10-15 18:17:31] /usr/bin/sort /app/public/converted_csv/sierra_leone_sp__refs_17181.csv > /app/public/converted_csv/sierra_leone_sp__refs_17181.csv_sorted
[CMD] [2019-10-15 18:17:31] /usr/bin/sort /app/public/converted_csv/sierra_leone_sp__nodes_17182.csv > /app/public/converted_csv/sierra_leone_sp__nodes_17182.csv_sorted
[CMD] [2019-10-15 18:17:31] /usr/bin/sort /app/public/converted_csv/sierra_leone_sp__occurrences_17183.csv > /app/public/converted_csv/sierra_leone_sp__occurrences_17183.csv_sorted
[CMD] [2019-10-15 18:17:31] /usr/bin/sort /app/public/converted_csv/sierra_leone_sp__measurements_17184.csv > /app/public/converted_csv/sierra_leone_sp__measurements_17184.csv_sorted
[STOP] [2019-10-15 18:17:32] convert_to_csv
[START] [2019-10-15 18:17:32] calculate_delta
[CMD] [2019-10-15 18:17:32] echo "0a" > /app/public/diff/sierra_leone_sp__refs_17181.diff
[CMD] [2019-10-15 18:17:32] tail -n +1 /app/public/converted_csv/sierra_leone_sp__refs_17181.csv >> /app/public/diff/sierra_leone_sp__refs_17181.diff
[CMD] [2019-10-15 18:17:32] echo "." >> /app/public/diff/sierra_leone_sp__refs_17181.diff
[CMD] [2019-10-15 18:17:33] echo "0a" > /app/public/diff/sierra_leone_sp__nodes_17182.diff
[CMD] [2019-10-15 18:17:33] tail -n +1 /app/public/converted_csv/sierra_leone_sp__nodes_17182.csv >> /app/public/diff/sierra_leone_sp__nodes_17182.diff
[CMD] [2019-10-15 18:17:33] echo "." >> /app/public/diff/sierra_leone_sp__nodes_17182.diff
[CMD] [2019-10-15 18:17:33] echo "0a" > /app/public/diff/sierra_leone_sp__occurrences_17183.diff
[CMD] [2019-10-15 18:17:34] tail -n +1 /app/public/converted_csv/sierra_leone_sp__occurrences_17183.csv >> /app/public/diff/sierra_leone_sp__occurrences_17183.diff
[CMD] [2019-10-15 18:17:34] echo "." >> /app/public/diff/sierra_leone_sp__occurrences_17183.diff
[CMD] [2019-10-15 18:17:34] echo "0a" > /app/public/diff/sierra_leone_sp__measurements_17184.diff
[CMD] [2019-10-15 18:17:35] tail -n +1 /app/public/converted_csv/sierra_leone_sp__measurements_17184.csv >> /app/public/diff/sierra_leone_sp__measurements_17184.diff
[CMD] [2019-10-15 18:17:35] echo "." >> /app/public/diff/sierra_leone_sp__measurements_17184.diff
[STOP] [2019-10-15 18:17:35] calculate_delta
[START] [2019-10-15 18:17:35] parse_diff_and_store
[INFO] [2019-10-15 18:17:35] Loading refs diff file into memory (true lines)...
[INFO] [2019-10-15 18:17:36] Loading nodes diff file into memory (true lines)...
[INFO] [2019-10-15 18:17:38] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-10-15 18:17:38] Loading measurements diff file into memory (true lines)...
[INFO] [2019-10-15 18:17:53] Storing 2 References
[INFO] [2019-10-15 18:17:53] Processing group of 2 in 1 groups of 1000
[INFO] [2019-10-15 18:17:53] Average Time: 0.0
[INFO] [2019-10-15 18:17:53] Total Time: 1s
[INFO] [2019-10-15 18:17:53] Storing 4421 ScientificNames
[INFO] [2019-10-15 18:17:53] Processing group of 4421 in 5 groups of 1000
[INFO] [2019-10-15 18:17:55] Average Time: 0.328
[INFO] [2019-10-15 18:17:55] Total Time: 2s
[INFO] [2019-10-15 18:17:55] Storing 4421 Nodes
[INFO] [2019-10-15 18:17:55] Processing group of 4421 in 5 groups of 1000
[INFO] [2019-10-15 18:17:56] Average Time: 0.274
[INFO] [2019-10-15 18:17:56] Total Time: 2s
[INFO] [2019-10-15 18:17:56] Storing 2410 Occurrences
[INFO] [2019-10-15 18:17:56] Processing group of 2410 in 3 groups of 1000
[INFO] [2019-10-15 18:17:57] Average Time: 0.083
[INFO] [2019-10-15 18:17:57] Total Time: 1s
[INFO] [2019-10-15 18:17:57] Storing 5200 TraitsReferences
[INFO] [2019-10-15 18:17:57] Processing group of 5200 in 6 groups of 1000
[INFO] [2019-10-15 18:17:57] Average Time: 0.085
[INFO] [2019-10-15 18:17:57] Total Time: 1s
[INFO] [2019-10-15 18:17:57] Storing 5199 Traits
[INFO] [2019-10-15 18:17:57] Processing group of 5199 in 6 groups of 1000
[INFO] [2019-10-15 18:17:59] Average Time: 0.272
[INFO] [2019-10-15 18:17:59] Total Time: 2s
[INFO] [2019-10-15 18:17:59] Storing 5198 MetaTraits
[INFO] [2019-10-15 18:17:59] Processing group of 5198 in 6 groups of 1000
[INFO] [2019-10-15 18:17:59] Average Time: 0.115
[INFO] [2019-10-15 18:17:59] Total Time: 1s
[STOP] [2019-10-15 18:18:00] parse_diff_and_store
[START] [2019-10-15 18:18:00] resolve_keys
[INFO] [2019-10-15 18:18:20] Occurrences to nodes (through scientific_names)...
[INFO] [2019-10-15 18:18:22] traits to occurrences...
[INFO] [2019-10-15 18:18:23] traits to nodes (through occurrences)...
[INFO] [2019-10-15 18:18:23] Traits to sex term...
[INFO] [2019-10-15 18:18:24] Traits to lifestage term...
[INFO] [2019-10-15 18:18:25] MetaTraits to traits...
[INFO] [2019-10-15 18:18:25] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-10-15 18:18:26] Assocs to occurrences...
[INFO] [2019-10-15 18:18:26] Assocs to nodes...
[INFO] [2019-10-15 18:18:26] Assoc to sex term...
[INFO] [2019-10-15 18:18:26] Assoc to lifestage term...
[STOP] [2019-10-15 18:18:26] resolve_keys
[START] [2019-10-15 18:18:26] hold_for_later_1
[STOP] [2019-10-15 18:18:26] hold_for_later_1
[START] [2019-10-15 18:18:26] hold_for_later_2
[STOP] [2019-10-15 18:18:26] hold_for_later_2
[START] [2019-10-15 18:18:26] resolve_missing_parents
[STOP] [2019-10-15 18:18:35] resolve_missing_parents
[START] [2019-10-15 18:18:35] rebuild_nodes
[START] [2019-10-15 18:18:35] Flattener#flatten
[START] [2019-10-15 18:18:35] Flattener#study_resource
[START] [2019-10-15 18:18:35] Flattener#build_ancestry
[STOP] [2019-10-15 18:18:36] Flattener#build_ancestry
[INFO] [2019-10-15 18:18:36] 4421 ancestry keys
[START] [2019-10-15 18:18:36] build_node_ancestors
[INFO] [2019-10-15 18:18:36] old ancestors deleted.
[STOP] [2019-10-15 18:18:37] build_node_ancestors
[START] [2019-10-15 18:18:38] Flattener#propagate_ancestor_ids
[STOP] [2019-10-15 18:18:38] Flattener#propagate_ancestor_ids
[STOP] [2019-10-15 18:18:38] Flattener#flatten
[STOP] [2019-10-15 18:18:38] rebuild_nodes
[START] [2019-10-15 18:18:38] resolve_missing_media_owners
[STOP] [2019-10-15 18:18:38] resolve_missing_media_owners
[START] [2019-10-15 18:18:38] sanitize_media_verbatims
[STOP] [2019-10-15 18:18:38] sanitize_media_verbatims
[START] [2019-10-15 18:18:38] queue_downloads
[STOP] [2019-10-15 18:18:38] queue_downloads
[START] [2019-10-15 18:18:38] parse_names
[WARN] [2019-10-15 18:18:38] I see 4421 names which still need to be parsed.
[STOP] [2019-10-15 18:18:42] parse_names
[START] [2019-10-15 18:18:42] denormalize_canonical_names_to_nodes
[STOP] [2019-10-15 18:18:42] denormalize_canonical_names_to_nodes
[START] [2019-10-15 18:18:42] match_nodes
[START] [2019-10-15 18:18:42] map_all_nodes_to_pages
[STOP] [2019-10-15 18:23:39] map_all_nodes_to_pages
[INFO] [2019-10-15 18:23:39] 178 Unmatched nodes (of 4421)! That's too many to output. First 10: Pennisetum uniseta (#51805780); Loxodera strigosum (#51803162); Sporobolus strictus (#51806276); Schizachyrium lomaensis (#51804431); Schizachyrium minimus (#51806007); Ischaemum fasciculatum (#51805259); Cenchrus polystachyon (#51805856); Cyperus halpan (#51803069); Nemum angolensis (#51804598); Kyllinga pumila (#51803844)
[START] [2019-10-15 18:23:39] update_nodes
[STOP] [2019-10-15 18:23:41] update_nodes
[STOP] [2019-10-15 18:23:41] match_nodes
[START] [2019-10-15 18:23:41] reindex_search
[STOP] [2019-10-15 18:23:57] reindex_search
[START] [2019-10-15 18:23:57] normalize_units
[STOP] [2019-10-15 18:23:57] normalize_units
[START] [2019-10-15 18:23:57] calculate_statistics
[STOP] [2019-10-15 18:23:57] calculate_statistics
[START] [2019-10-15 18:23:57] complete_harvest_instance
[START] [2019-10-15 18:23:57] overall_tsv_creation
[INFO] [2019-10-15 18:23:58] Processing group of 4421 in 1 batches of 10000
[INFO] [2019-10-15 18:25:02] 2410 Traits (unfiltered)...
[INFO] [2019-10-15 18:25:16] 2410 Traits (filtered)...
[INFO] [2019-10-15 18:25:16] 0 Associations (filtered)...
[INFO] [2019-10-15 18:25:59] 12048 metadata added.
[INFO] [2019-10-15 18:25:59] 0 metadata added.
[INFO] [2019-10-15 18:25:59] Average Time: 98.5
[INFO] [2019-10-15 18:25:59] Total Time: 2m2s
[STOP] [2019-10-15 18:25:59] overall_tsv_creation
[INFO] [2019-10-15 18:25:59] Done. Check your files:
[INFO] [2019-10-15 18:25:59] (4421 lines) /app/public/data/sierra_leone_sp_/publish_nodes.tsv
[INFO] [2019-10-15 18:26:00] (10376 lines) /app/public/data/sierra_leone_sp_/publish_node_ancestors.tsv
[INFO] [2019-10-15 18:26:00] (4421 lines) /app/public/data/sierra_leone_sp_/publish_scientific_names.tsv
[INFO] [2019-10-15 18:26:00] (2411 lines) /app/public/data/sierra_leone_sp_/publish_traits.tsv
[INFO] [2019-10-15 18:26:01] (12049 lines) /app/public/data/sierra_leone_sp_/publish_metadata.tsv
[STOP] [2019-10-15 18:26:01] complete_harvest_instance
[START] [2019-10-15 18:26:01] completed
[STOP] [2019-10-15 18:26:01] completed
[STOP] [2019-10-15 18:26:01] logged process, took 511.38
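The [START]/[STOP] pairs in the log above can be turned into per-stage durations, which makes it easier to see where the harvest time goes (map_all_nodes_to_pages, for example, runs from 18:18:42 to 18:23:39). The following is a minimal sketch, not part of the harvester itself: the function name `stage_durations` is hypothetical, and it assumes every line has the form `[TAG] [YYYY-MM-DD HH:MM:SS] message`, that a stage name is open at most once at a time, and that any text after a comma on a [STOP] line (such as ", took 511.38") is a suffix rather than part of the stage name.

```python
import re
from datetime import datetime

# Matches only START/STOP lines; INFO, WARN and CMD lines are ignored.
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)


def stage_durations(lines):
    """Return (stage, seconds) pairs in the order the stages stopped."""
    open_stages = {}  # stage name -> datetime of its START line
    durations = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        tag, stamp, name = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if tag == "START":
            open_stages[name] = when
        else:
            # Assumption: drop a trailing ", took ..." suffix on STOP lines.
            name = name.split(",")[0]
            started = open_stages.pop(name, None)
            if started is not None:
                durations.append((name, (when - started).total_seconds()))
    return durations


# Four lines copied from the log above.
sample = [
    "[START] [2019-10-15 18:18:42] match_nodes",
    "[START] [2019-10-15 18:18:42] map_all_nodes_to_pages",
    "[STOP] [2019-10-15 18:23:39] map_all_nodes_to_pages",
    "[STOP] [2019-10-15 18:23:41] match_nodes",
]

for stage, seconds in stage_durations(sample):
    print(f"{stage}: {seconds:.0f}s")
```

Because the timestamps have one-second resolution, stages that start and stop within the same second are reported as 0 seconds; the durations are only as precise as the log itself.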
Latest Process