Harvest for Kazakhstan Species List, Created 13 Oct 23:58

Stage: completed
Fetched: 13 Oct 23:58
Validated: 13 Oct 23:58
Deltas Created: 13 Oct 23:58
Units Normalized: 14 Oct 00:05
Ancestry Built: 13 Oct 23:59
Nodes Matched: 14 Oct 00:05
Names Parsed: 13 Oct 23:59
New Models Stored: 13 Oct 23:58
Indexed: 14 Oct 00:05
Completed: 14 Oct 00:07
Time to Harvest: about 9 minutes (the log below reports 567.88 seconds)

Harvesting Log

(145 lines)
# Logfile created on 2019-10-13 23:58:19 -0400 by logger.rb/56815
[START] [2019-10-13 23:58:19] logged process
[START] [2019-10-13 23:58:19] create_harvest_instance
[STOP] [2019-10-13 23:58:19] create_harvest_instance
[START] [2019-10-13 23:58:19] fetch_files
[STOP] [2019-10-13 23:58:19] fetch_files
[START] [2019-10-13 23:58:19] validate_each_file
[STOP] [2019-10-13 23:58:20] validate_each_file
[START] [2019-10-13 23:58:20] convert_to_csv
[CMD] [2019-10-13 23:58:20] /usr/bin/sort /app/public/converted_csv/kazakhstan_sp_li_refs_16291.csv > /app/public/converted_csv/kazakhstan_sp_li_refs_16291.csv_sorted
[CMD] [2019-10-13 23:58:20] /usr/bin/sort /app/public/converted_csv/kazakhstan_sp_li_nodes_16292.csv > /app/public/converted_csv/kazakhstan_sp_li_nodes_16292.csv_sorted
[CMD] [2019-10-13 23:58:20] /usr/bin/sort /app/public/converted_csv/kazakhstan_sp_li_occurrences_16293.csv > /app/public/converted_csv/kazakhstan_sp_li_occurrences_16293.csv_sorted
[CMD] [2019-10-13 23:58:20] /usr/bin/sort /app/public/converted_csv/kazakhstan_sp_li_measurements_16294.csv > /app/public/converted_csv/kazakhstan_sp_li_measurements_16294.csv_sorted
[STOP] [2019-10-13 23:58:20] convert_to_csv
[START] [2019-10-13 23:58:20] calculate_delta
[CMD] [2019-10-13 23:58:20] echo "0a" > /app/public/diff/kazakhstan_sp_li_refs_16291.diff
[CMD] [2019-10-13 23:58:20] tail -n +1 /app/public/converted_csv/kazakhstan_sp_li_refs_16291.csv >> /app/public/diff/kazakhstan_sp_li_refs_16291.diff
[CMD] [2019-10-13 23:58:20] echo "." >> /app/public/diff/kazakhstan_sp_li_refs_16291.diff
[CMD] [2019-10-13 23:58:20] echo "0a" > /app/public/diff/kazakhstan_sp_li_nodes_16292.diff
[CMD] [2019-10-13 23:58:20] tail -n +1 /app/public/converted_csv/kazakhstan_sp_li_nodes_16292.csv >> /app/public/diff/kazakhstan_sp_li_nodes_16292.diff
[CMD] [2019-10-13 23:58:21] echo "." >> /app/public/diff/kazakhstan_sp_li_nodes_16292.diff
[CMD] [2019-10-13 23:58:21] echo "0a" > /app/public/diff/kazakhstan_sp_li_occurrences_16293.diff
[CMD] [2019-10-13 23:58:21] tail -n +1 /app/public/converted_csv/kazakhstan_sp_li_occurrences_16293.csv >> /app/public/diff/kazakhstan_sp_li_occurrences_16293.diff
[CMD] [2019-10-13 23:58:21] echo "." >> /app/public/diff/kazakhstan_sp_li_occurrences_16293.diff
[CMD] [2019-10-13 23:58:21] echo "0a" > /app/public/diff/kazakhstan_sp_li_measurements_16294.diff
[CMD] [2019-10-13 23:58:21] tail -n +1 /app/public/converted_csv/kazakhstan_sp_li_measurements_16294.csv >> /app/public/diff/kazakhstan_sp_li_measurements_16294.diff
[CMD] [2019-10-13 23:58:21] echo "." >> /app/public/diff/kazakhstan_sp_li_measurements_16294.diff
[STOP] [2019-10-13 23:58:21] calculate_delta
[START] [2019-10-13 23:58:21] parse_diff_and_store
[INFO] [2019-10-13 23:58:21] Loading refs diff file into memory (true lines)...
[INFO] [2019-10-13 23:58:21] Loading nodes diff file into memory (true lines)...
[INFO] [2019-10-13 23:58:23] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-10-13 23:58:24] Loading measurements diff file into memory (true lines)...
[INFO] [2019-10-13 23:58:41] Storing 2 References
[INFO] [2019-10-13 23:58:41] Processing group of 2 in 1 groups of 1000
[INFO] [2019-10-13 23:58:41] Average Time: 0.0
[INFO] [2019-10-13 23:58:41] Total Time: 1s
[INFO] [2019-10-13 23:58:41] Storing 5547 ScientificNames
[INFO] [2019-10-13 23:58:41] Processing group of 5547 in 6 groups of 1000
[INFO] [2019-10-13 23:58:44] Average Time: 0.37
[INFO] [2019-10-13 23:58:44] Total Time: 3s
[INFO] [2019-10-13 23:58:44] Storing 5547 Nodes
[INFO] [2019-10-13 23:58:44] Processing group of 5547 in 6 groups of 1000
[INFO] [2019-10-13 23:58:45] Average Time: 0.278
[INFO] [2019-10-13 23:58:45] Total Time: 2s
[INFO] [2019-10-13 23:58:45] Storing 2892 Occurrences
[INFO] [2019-10-13 23:58:45] Processing group of 2892 in 3 groups of 1000
[INFO] [2019-10-13 23:58:46] Average Time: 0.103
[INFO] [2019-10-13 23:58:46] Total Time: 1s
[INFO] [2019-10-13 23:58:46] Storing 6120 TraitsReferences
[INFO] [2019-10-13 23:58:46] Processing group of 6120 in 7 groups of 1000
[INFO] [2019-10-13 23:58:46] Average Time: 0.074
[INFO] [2019-10-13 23:58:46] Total Time: 1s
[INFO] [2019-10-13 23:58:46] last 3 / first 3: 0.52
[INFO] [2019-10-13 23:58:46] Std.Dev: 0.044721359549995794; Max: 0.15
[INFO] [2019-10-13 23:58:46] Storing 6119 Traits
[INFO] [2019-10-13 23:58:46] Processing group of 6119 in 7 groups of 1000
[INFO] [2019-10-13 23:58:48] Average Time: 0.276
[INFO] [2019-10-13 23:58:48] Total Time: 2s
[INFO] [2019-10-13 23:58:48] last 3 / first 3: 0.59
[INFO] [2019-10-13 23:58:48] Std.Dev: 0.11401754250991379; Max: 0.4
[INFO] [2019-10-13 23:58:48] Storing 6104 MetaTraits
[INFO] [2019-10-13 23:58:48] Processing group of 6104 in 7 groups of 1000
[INFO] [2019-10-13 23:58:49] Average Time: 0.11
[INFO] [2019-10-13 23:58:49] Total Time: 1s
[INFO] [2019-10-13 23:58:49] last 3 / first 3: 0.69
[INFO] [2019-10-13 23:58:49] Std.Dev: 0.044721359549995794; Max: 0.16
[STOP] [2019-10-13 23:58:49] parse_diff_and_store
[START] [2019-10-13 23:58:49] resolve_keys
[INFO] [2019-10-13 23:59:13] Occurrences to nodes (through scientific_names)...
[INFO] [2019-10-13 23:59:15] traits to occurrences...
[INFO] [2019-10-13 23:59:17] traits to nodes (through occurrences)...
[INFO] [2019-10-13 23:59:17] Traits to sex term...
[INFO] [2019-10-13 23:59:18] Traits to lifestage term...
[INFO] [2019-10-13 23:59:19] MetaTraits to traits...
[INFO] [2019-10-13 23:59:20] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-10-13 23:59:20] Assocs to occurrences...
[INFO] [2019-10-13 23:59:20] Assocs to nodes...
[INFO] [2019-10-13 23:59:20] Assoc to sex term...
[INFO] [2019-10-13 23:59:20] Assoc to lifestage term...
[STOP] [2019-10-13 23:59:20] resolve_keys
[START] [2019-10-13 23:59:20] hold_for_later_1
[STOP] [2019-10-13 23:59:20] hold_for_later_1
[START] [2019-10-13 23:59:20] hold_for_later_2
[STOP] [2019-10-13 23:59:20] hold_for_later_2
[START] [2019-10-13 23:59:20] resolve_missing_parents
[STOP] [2019-10-13 23:59:32] resolve_missing_parents
[START] [2019-10-13 23:59:32] rebuild_nodes
[START] [2019-10-13 23:59:32] Flattener#flatten
[START] [2019-10-13 23:59:32] Flattener#study_resource
[START] [2019-10-13 23:59:32] Flattener#build_ancestry
[STOP] [2019-10-13 23:59:33] Flattener#build_ancestry
[INFO] [2019-10-13 23:59:33] 5547 ancestry keys
[START] [2019-10-13 23:59:33] build_node_ancestors
[INFO] [2019-10-13 23:59:33] old ancestors deleted.
[STOP] [2019-10-13 23:59:34] build_node_ancestors
[START] [2019-10-13 23:59:35] Flattener#propagate_ancestor_ids
[STOP] [2019-10-13 23:59:36] Flattener#propagate_ancestor_ids
[STOP] [2019-10-13 23:59:36] Flattener#flatten
[STOP] [2019-10-13 23:59:36] rebuild_nodes
[START] [2019-10-13 23:59:36] resolve_missing_media_owners
[STOP] [2019-10-13 23:59:36] resolve_missing_media_owners
[START] [2019-10-13 23:59:36] sanitize_media_verbatims
[STOP] [2019-10-13 23:59:36] sanitize_media_verbatims
[START] [2019-10-13 23:59:36] queue_downloads
[STOP] [2019-10-13 23:59:36] queue_downloads
[START] [2019-10-13 23:59:36] parse_names
[WARN] [2019-10-13 23:59:36] I see 5547 names which still need to be parsed.
[STOP] [2019-10-13 23:59:41] parse_names
[START] [2019-10-13 23:59:41] denormalize_canonical_names_to_nodes
[STOP] [2019-10-13 23:59:41] denormalize_canonical_names_to_nodes
[START] [2019-10-13 23:59:41] match_nodes
[START] [2019-10-13 23:59:41] map_all_nodes_to_pages
[STOP] [2019-10-14 00:05:23] map_all_nodes_to_pages
[INFO] [2019-10-14 00:05:23] 648 Unmatched nodes (of 5547)! That's too many to output. First 10: Magnoliopsida (#50482828); Pyrus hybrida (#50484907); Crataegus korolkowi (#50488319); Tamarix polystachya (#50486759); Arenaria longifolia (#50488292); Silene dianthifolia (#50486477); Silene leptopetala (#50487146); Melilotus alba (#50483645); Astragalus bracteosa (#50486668); Astragalus eremothamnus (#50486846)
[START] [2019-10-14 00:05:23] update_nodes
[STOP] [2019-10-14 00:05:25] update_nodes
[STOP] [2019-10-14 00:05:25] match_nodes
[START] [2019-10-14 00:05:25] reindex_search
[STOP] [2019-10-14 00:05:39] reindex_search
[START] [2019-10-14 00:05:39] normalize_units
[STOP] [2019-10-14 00:05:39] normalize_units
[START] [2019-10-14 00:05:39] calculate_statistics
[STOP] [2019-10-14 00:05:39] calculate_statistics
[START] [2019-10-14 00:05:39] complete_harvest_instance
[START] [2019-10-14 00:05:39] overall_tsv_creation
[INFO] [2019-10-14 00:05:39] Processing group of 5547 in 1 batches of 10000
[INFO] [2019-10-14 00:06:48] 2892 Traits (unfiltered)...
[INFO] [2019-10-14 00:07:02] 2892 Traits (filtered)...
[INFO] [2019-10-14 00:07:02] 0 Associations (filtered)...
[INFO] [2019-10-14 00:07:46] 14447 metadata added.
[INFO] [2019-10-14 00:07:46] 0 metadata added.
[INFO] [2019-10-14 00:07:46] Average Time: 102.46
[INFO] [2019-10-14 00:07:46] Total Time: 2m8s
[STOP] [2019-10-14 00:07:46] overall_tsv_creation
[INFO] [2019-10-14 00:07:46] Done. Check your files:
[INFO] [2019-10-14 00:07:46] (5547 lines) /app/public/data/kazakhstan_sp_li/publish_nodes.tsv
[INFO] [2019-10-14 00:07:46] (17707 lines) /app/public/data/kazakhstan_sp_li/publish_node_ancestors.tsv
[INFO] [2019-10-14 00:07:46] (5547 lines) /app/public/data/kazakhstan_sp_li/publish_scientific_names.tsv
[INFO] [2019-10-14 00:07:46] (2893 lines) /app/public/data/kazakhstan_sp_li/publish_traits.tsv
[INFO] [2019-10-14 00:07:46] (14448 lines) /app/public/data/kazakhstan_sp_li/publish_metadata.tsv
[STOP] [2019-10-14 00:07:46] complete_harvest_instance
[START] [2019-10-14 00:07:46] completed
[STOP] [2019-10-14 00:07:46] completed
[STOP] [2019-10-14 00:07:46] logged process, took 567.88
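
The [START]/[STOP] pairs in the log above can be turned into per-stage durations, which makes it easier to see where the harvest time went. The following is a minimal sketch in Python, not part of the harvester itself: the function name stage_durations and the regular expression are assumptions made for illustration, based only on the line format visible in this log. It assumes each stage name is open at most once at a time, which holds for this log.

```python
import re
from datetime import datetime

# Matches lines such as "[START] [2019-10-13 23:59:41] match_nodes".
# The pattern is inferred from this log only (an assumption).
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)

def stage_durations(lines):
    """Return (stage, seconds) pairs in the order the stages stop."""
    open_stages = {}  # stage name -> start time
    durations = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip [INFO], [CMD], [WARN] and header lines
        kind, stamp, name = m.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            open_stages[name] = when
        else:
            # The final STOP line carries a ", took N" suffix; drop it.
            name = name.split(", took")[0]
            started = open_stages.pop(name, None)
            if started is not None:
                durations.append((name, (when - started).total_seconds()))
    return durations

# START/STOP lines copied from the match_nodes stage of the log above.
sample = [
    "[START] [2019-10-13 23:59:41] match_nodes",
    "[START] [2019-10-13 23:59:41] map_all_nodes_to_pages",
    "[STOP] [2019-10-14 00:05:23] map_all_nodes_to_pages",
    "[START] [2019-10-14 00:05:23] update_nodes",
    "[STOP] [2019-10-14 00:05:25] update_nodes",
    "[STOP] [2019-10-14 00:05:25] match_nodes",
]

for name, secs in stage_durations(sample):
    print(f"{name}: {secs:.0f}s")
```

Run on these six lines, the sketch attributes 342 of the 344 seconds spent in match_nodes to map_all_nodes_to_pages. Timestamps in the log have one-second resolution, so short stages will show as 0 or 1 second.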

Latest Process