Stage:
completed
Fetched:
25 Dec 09:09
Validated:
25 Dec 09:09
Deltas Created:
25 Dec 09:09
Units Normalized:
25 Dec 13:36
Ancestry Built:
25 Dec 09:15
Nodes Matched:
25 Dec 12:00
Names Parsed:
25 Dec 09:15
New Models Stored:
25 Dec 09:12
Indexed:
25 Dec 13:36
Completed:
25 Dec 13:43
Time to Harvest:
less than a minute
Harvesting Log
(194 lines)
# Logfile created on 2019-12-25 09:09:14 -0500 by logger.rb/56815
[START] [2019-12-25 09:09:14] logged process
[START] [2019-12-25 09:09:15] create_harvest_instance
[STOP] [2019-12-25 09:09:15] create_harvest_instance
[START] [2019-12-25 09:09:15] fetch_files
[STOP] [2019-12-25 09:09:15] fetch_files
[START] [2019-12-25 09:09:15] validate_each_file
[STOP] [2019-12-25 09:09:19] validate_each_file
[START] [2019-12-25 09:09:19] convert_to_csv
[CMD] [2019-12-25 09:09:19] /usr/bin/sort /app/public/converted_csv/philippine_sea_s_refs_19586.csv > /app/public/converted_csv/philippine_sea_s_refs_19586.csv_sorted
[CMD] [2019-12-25 09:09:19] /usr/bin/sort /app/public/converted_csv/philippine_sea_s_nodes_19587.csv > /app/public/converted_csv/philippine_sea_s_nodes_19587.csv_sorted
[CMD] [2019-12-25 09:09:20] /usr/bin/sort /app/public/converted_csv/philippine_sea_s_occurrences_19588.csv > /app/public/converted_csv/philippine_sea_s_occurrences_19588.csv_sorted
[CMD] [2019-12-25 09:09:21] /usr/bin/sort /app/public/converted_csv/philippine_sea_s_measurements_19589.csv > /app/public/converted_csv/philippine_sea_s_measurements_19589.csv_sorted
[STOP] [2019-12-25 09:09:22] convert_to_csv
[START] [2019-12-25 09:09:22] calculate_delta
[CMD] [2019-12-25 09:09:22] echo "0a" > /app/public/diff/philippine_sea_s_refs_19586.diff
[CMD] [2019-12-25 09:09:22] tail -n +1 /app/public/converted_csv/philippine_sea_s_refs_19586.csv >> /app/public/diff/philippine_sea_s_refs_19586.diff
[CMD] [2019-12-25 09:09:23] echo "." >> /app/public/diff/philippine_sea_s_refs_19586.diff
[CMD] [2019-12-25 09:09:24] echo "0a" > /app/public/diff/philippine_sea_s_nodes_19587.diff
[CMD] [2019-12-25 09:09:24] tail -n +1 /app/public/converted_csv/philippine_sea_s_nodes_19587.csv >> /app/public/diff/philippine_sea_s_nodes_19587.diff
[CMD] [2019-12-25 09:09:25] echo "." >> /app/public/diff/philippine_sea_s_nodes_19587.diff
[CMD] [2019-12-25 09:09:26] echo "0a" > /app/public/diff/philippine_sea_s_occurrences_19588.diff
[CMD] [2019-12-25 09:09:26] tail -n +1 /app/public/converted_csv/philippine_sea_s_occurrences_19588.csv >> /app/public/diff/philippine_sea_s_occurrences_19588.diff
[CMD] [2019-12-25 09:09:27] echo "." >> /app/public/diff/philippine_sea_s_occurrences_19588.diff
[CMD] [2019-12-25 09:09:28] echo "0a" > /app/public/diff/philippine_sea_s_measurements_19589.diff
[CMD] [2019-12-25 09:09:28] tail -n +1 /app/public/converted_csv/philippine_sea_s_measurements_19589.csv >> /app/public/diff/philippine_sea_s_measurements_19589.diff
[CMD] [2019-12-25 09:09:29] echo "." >> /app/public/diff/philippine_sea_s_measurements_19589.diff
[STOP] [2019-12-25 09:09:30] calculate_delta
[START] [2019-12-25 09:09:30] parse_diff_and_store
[INFO] [2019-12-25 09:09:30] Loading refs diff file into memory (true lines)...
[INFO] [2019-12-25 09:09:31] Loading nodes diff file into memory (true lines)...
[INFO] [2019-12-25 09:09:42] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-12-25 09:09:45] Loading measurements diff file into memory (true lines)...
[INFO] [2019-12-25 09:11:24] Storing 2 References
[INFO] [2019-12-25 09:11:24] Processing group of 2 in 1 groups of 1000
[INFO] [2019-12-25 09:11:24] Average Time: 0.0
[INFO] [2019-12-25 09:11:24] Total Time: 1s
[INFO] [2019-12-25 09:11:24] Storing 24975 ScientificNames
[INFO] [2019-12-25 09:11:24] Processing group of 24975 in 25 groups of 1000
[INFO] [2019-12-25 09:11:34] Average Time: 0.391
[INFO] [2019-12-25 09:11:34] Total Time: 10s
[INFO] [2019-12-25 09:11:34] last 3 / first 3: 1.02
[INFO] [2019-12-25 09:11:34] Std.Dev: 0.03162277660168379; Max: 0.46
[INFO] [2019-12-25 09:11:34] Storing 24975 Nodes
[INFO] [2019-12-25 09:11:34] Processing group of 24975 in 25 groups of 1000
[INFO] [2019-12-25 09:11:43] Average Time: 0.355
[INFO] [2019-12-25 09:11:43] Total Time: 9s
[INFO] [2019-12-25 09:11:43] last 3 / first 3: 1.85
[INFO] [2019-12-25 09:11:43] Std.Dev: 0.09486832980505137; Max: 0.7
[INFO] [2019-12-25 09:11:43] Storing 16741 Occurrences
[INFO] [2019-12-25 09:11:43] Processing group of 16741 in 17 groups of 1000
[INFO] [2019-12-25 09:11:45] Average Time: 0.148
[INFO] [2019-12-25 09:11:45] Total Time: 3s
[INFO] [2019-12-25 09:11:45] last 3 / first 3: 0.39
[INFO] [2019-12-25 09:11:45] Std.Dev: 0.08366600265340755; Max: 0.44
[INFO] [2019-12-25 09:11:45] Storing 33482 TraitsReferences
[INFO] [2019-12-25 09:11:45] Processing group of 33482 in 34 groups of 1000
[INFO] [2019-12-25 09:11:48] Average Time: 0.079
[INFO] [2019-12-25 09:11:48] Total Time: 3s
[INFO] [2019-12-25 09:11:48] last 3 / first 3: 0.58
[INFO] [2019-12-25 09:11:48] Std.Dev: 0.03162277660168379; Max: 0.16
[INFO] [2019-12-25 09:11:48] Storing 33482 Traits
[INFO] [2019-12-25 09:11:48] Processing group of 33482 in 34 groups of 1000
[INFO] [2019-12-25 09:12:02] Average Time: 0.392
[INFO] [2019-12-25 09:12:02] Total Time: 14s
[INFO] [2019-12-25 09:12:02] last 3 / first 3: 0.7
[INFO] [2019-12-25 09:12:02] Std.Dev: 0.18708286933869708; Max: 1.4
[INFO] [2019-12-25 09:12:02] Storing 33470 MetaTraits
[INFO] [2019-12-25 09:12:02] Processing group of 33470 in 34 groups of 1000
[INFO] [2019-12-25 09:12:08] Average Time: 0.194
[INFO] [2019-12-25 09:12:08] Total Time: 7s
[INFO] [2019-12-25 09:12:08] last 3 / first 3: 0.83
[INFO] [2019-12-25 09:12:08] Std.Dev: 0.2449489742783178; Max: 1.57
[STOP] [2019-12-25 09:12:08] parse_diff_and_store
[START] [2019-12-25 09:12:08] resolve_keys
[INFO] [2019-12-25 09:13:25] Occurrences to nodes (through scientific_names)...
[INFO] [2019-12-25 09:13:32] traits to occurrences...
[INFO] [2019-12-25 09:13:40] traits to nodes (through occurrences)...
[INFO] [2019-12-25 09:13:41] Traits to sex term...
[INFO] [2019-12-25 09:13:48] Traits to lifestage term...
[INFO] [2019-12-25 09:13:56] MetaTraits to traits...
[INFO] [2019-12-25 09:13:58] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-12-25 09:14:03] Assocs to occurrences...
[INFO] [2019-12-25 09:14:03] Assocs to nodes...
[INFO] [2019-12-25 09:14:03] Assoc to sex term...
[INFO] [2019-12-25 09:14:03] Assoc to lifestage term...
[STOP] [2019-12-25 09:14:03] resolve_keys
[START] [2019-12-25 09:14:03] hold_for_later_1
[STOP] [2019-12-25 09:14:03] hold_for_later_1
[START] [2019-12-25 09:14:03] hold_for_later_2
[STOP] [2019-12-25 09:14:03] hold_for_later_2
[START] [2019-12-25 09:14:03] resolve_missing_parents
[STOP] [2019-12-25 09:14:46] resolve_missing_parents
[START] [2019-12-25 09:14:46] rebuild_nodes
[START] [2019-12-25 09:14:46] Flattener#flatten
[START] [2019-12-25 09:14:46] Flattener#study_resource
[START] [2019-12-25 09:14:46] Flattener#build_ancestry
[STOP] [2019-12-25 09:14:49] Flattener#build_ancestry
[INFO] [2019-12-25 09:14:49] 24975 ancestry keys
[START] [2019-12-25 09:14:49] build_node_ancestors
[INFO] [2019-12-25 09:14:49] old ancestors deleted.
[STOP] [2019-12-25 09:15:08] build_node_ancestors
[START] [2019-12-25 09:15:11] Flattener#propagate_ancestor_ids
[STOP] [2019-12-25 09:15:14] Flattener#propagate_ancestor_ids
[STOP] [2019-12-25 09:15:14] Flattener#flatten
[STOP] [2019-12-25 09:15:14] rebuild_nodes
[START] [2019-12-25 09:15:14] resolve_missing_media_owners
[STOP] [2019-12-25 09:15:14] resolve_missing_media_owners
[START] [2019-12-25 09:15:14] sanitize_media_verbatims
[STOP] [2019-12-25 09:15:14] sanitize_media_verbatims
[START] [2019-12-25 09:15:14] queue_downloads
[STOP] [2019-12-25 09:15:14] queue_downloads
[START] [2019-12-25 09:15:14] parse_names
[WARN] [2019-12-25 09:15:14] I see 24975 names which still need to be parsed.
[STOP] [2019-12-25 09:15:36] parse_names
[START] [2019-12-25 09:15:36] denormalize_canonical_names_to_nodes
[STOP] [2019-12-25 09:15:36] denormalize_canonical_names_to_nodes
[START] [2019-12-25 09:15:36] match_nodes
[START] [2019-12-25 09:15:36] map_all_nodes_to_pages
[STOP] [2019-12-25 11:59:57] map_all_nodes_to_pages
[INFO] [2019-12-25 11:59:57] 1385 Unmatched nodes (of 24975)! That's too many to output. First 10: Hemiaulus indicus (#62416433); Streptothecaceae (#62411564); Streptotheca (#62411563); Streptotheca thamensis (#62411562); Skeletonemaceae (#62399130); Detonula delicatula (#62404858); Thallassiosiraceae (#62399302); Thalassiosira leptopus (#62402217); Thalassiosira binata (#62411964); Lithodesmidales (#62399247)
[START] [2019-12-25 11:59:57] update_nodes
[STOP] [2019-12-25 12:00:07] update_nodes
[STOP] [2019-12-25 12:00:07] match_nodes
[START] [2019-12-25 12:00:07] reindex_search
[STOP] [2019-12-25 12:00:31] reindex_search
[ERR] [2019-12-25 12:00:31] Faraday::ConnectionFailed
[ERR] [2019-12-25 12:00:31] Failed to open TCP connection to elasticsearch:9200 (Connection refused - connect(2) for "elasticsearch" port 9200)
[ERR] [2019-12-25 12:00:31] ../models/resource_harvester.rb:615:in `reindex_search'
[ERR] [2019-12-25 12:00:31] ../models/resource_harvester.rb:86:in `block (3 levels) in start'
[ERR] [2019-12-25 12:00:31] ../models/logged_process.rb:19:in `run_step'
[ERR] [2019-12-25 12:00:31] ../models/resource_harvester.rb:86:in `block (2 levels) in start'
[ERR] [2019-12-25 12:00:31] ../models/resource_harvester.rb:75:in `each_key'
[ERR] [2019-12-25 12:00:31] ../models/resource_harvester.rb:75:in `block in start'
[ERR] [2019-12-25 12:00:31] ../models/resource.rb:139:in `lock'
[ERR] [2019-12-25 12:00:31] ../models/resource_harvester.rb:72:in `start'
[ERR] [2019-12-25 12:00:31] ../models/resource.rb:223:in `harvest'
[ERR] [2019-12-25 12:00:31] ../models/resource.rb:199:in `re_download_opendata_and_harvest'
[STOP] [2019-12-25 12:00:31] logged process, took 10276.93
[START] [2019-12-25 13:35:18] logged process
[INFO] [2019-12-25 13:35:18] Already completed stage create_harvest_instance, skipping...
[INFO] [2019-12-25 13:35:18] Already completed stage fetch_files, skipping...
[INFO] [2019-12-25 13:35:18] Already completed stage validate_each_file, skipping...
[INFO] [2019-12-25 13:35:18] Already completed stage convert_to_csv, skipping...
[INFO] [2019-12-25 13:35:18] Already completed stage calculate_delta, skipping...
[INFO] [2019-12-25 13:35:18] Already completed stage parse_diff_and_store, skipping...
[INFO] [2019-12-25 13:35:18] Already completed stage resolve_keys, skipping...
[INFO] [2019-12-25 13:35:18] Already completed stage hold_for_later_1, skipping...
[INFO] [2019-12-25 13:35:18] Already completed stage hold_for_later_2, skipping...
[INFO] [2019-12-25 13:35:18] Already completed stage resolve_missing_parents, skipping...
[INFO] [2019-12-25 13:35:18] Already completed stage rebuild_nodes, skipping...
[INFO] [2019-12-25 13:35:18] Already completed stage resolve_missing_media_owners, skipping...
[INFO] [2019-12-25 13:35:18] Already completed stage sanitize_media_verbatims, skipping...
[INFO] [2019-12-25 13:35:18] Already completed stage queue_downloads, skipping...
[INFO] [2019-12-25 13:35:18] Already completed stage parse_names, skipping...
[INFO] [2019-12-25 13:35:18] Already completed stage denormalize_canonical_names_to_nodes, skipping...
[INFO] [2019-12-25 13:35:18] Already completed stage match_nodes, skipping...
[START] [2019-12-25 13:35:18] reindex_search
[STOP] [2019-12-25 13:36:02] reindex_search
[START] [2019-12-25 13:36:02] normalize_units
[STOP] [2019-12-25 13:36:02] normalize_units
[START] [2019-12-25 13:36:02] calculate_statistics
[STOP] [2019-12-25 13:36:03] calculate_statistics
[START] [2019-12-25 13:36:03] complete_harvest_instance
[START] [2019-12-25 13:36:03] overall_tsv_creation
[INFO] [2019-12-25 13:36:03] Processing group of 24975 in 3 batches of 10000
[INFO] [2019-12-25 13:37:33] 6060 Traits (unfiltered)...
[INFO] [2019-12-25 13:37:48] 6060 Traits (filtered)...
[INFO] [2019-12-25 13:37:48] 0 Associations (filtered)...
[INFO] [2019-12-25 13:38:41] 30293 metadata added.
[INFO] [2019-12-25 13:38:41] 0 metadata added.
[INFO] [2019-12-25 13:40:20] 7001 Traits (unfiltered)...
[INFO] [2019-12-25 13:40:34] 7001 Traits (filtered)...
[INFO] [2019-12-25 13:40:34] 0 Associations (filtered)...
[INFO] [2019-12-25 13:41:34] 35003 metadata added.
[INFO] [2019-12-25 13:41:34] 0 metadata added.
[INFO] [2019-12-25 13:42:49] 3680 Traits (unfiltered)...
[INFO] [2019-12-25 13:43:02] 3680 Traits (filtered)...
[INFO] [2019-12-25 13:43:03] 0 Associations (filtered)...
[INFO] [2019-12-25 13:43:51] 18397 metadata added.
[INFO] [2019-12-25 13:43:51] 0 metadata added.
[INFO] [2019-12-25 13:43:51] Average Time: 126.857
[INFO] [2019-12-25 13:43:51] Total Time: 7m49s
[STOP] [2019-12-25 13:43:51] overall_tsv_creation
[INFO] [2019-12-25 13:43:51] Done. Check your files:
[INFO] [2019-12-25 13:43:51] (24975 lines) /app/public/data/philippine_sea_s/publish_nodes.tsv
[INFO] [2019-12-25 13:43:51] (136291 lines) /app/public/data/philippine_sea_s/publish_node_ancestors.tsv
[INFO] [2019-12-25 13:43:51] (24975 lines) /app/public/data/philippine_sea_s/publish_scientific_names.tsv
[INFO] [2019-12-25 13:43:51] (16742 lines) /app/public/data/philippine_sea_s/publish_traits.tsv
[INFO] [2019-12-25 13:43:51] (83694 lines) /app/public/data/philippine_sea_s/publish_metadata.tsv
[STOP] [2019-12-25 13:43:52] complete_harvest_instance
[START] [2019-12-25 13:43:52] completed
[STOP] [2019-12-25 13:43:52] completed
[STOP] [2019-12-25 13:43:52] logged process, took 513.55
Latest Process