Harvest for 3I: Deltocephalinae Created 28 Jan 10:15

Stage: completed
Fetched: 28 Jan 10:15
Validated: 28 Jan 10:15
Deltas Created: 28 Jan 10:15
Units Normalized: 28 Jan 10:28
Ancestry Built: 28 Jan 10:16
Nodes Matched: 28 Jan 10:28
Names Parsed: 28 Jan 10:17
New Models Stored: 28 Jan 10:16
Indexed: 28 Jan 10:28
Completed: 28 Jan 10:32
Time to Harvest: about 17 minutes

Harvesting Log

(261 lines)
# Logfile created on 2021-01-22 14:35:06 -0500 by logger.rb/v1.4.2
[START] [2021-01-22 14:35:06] logged process: ca5be136aef877c71c74100a42de34a9e7a07645

[START] [2021-01-22 14:35:06] Creating resource from OpenData
[START] [2021-01-22 14:35:07] logged process: ca5be136aef877c71c74100a42de34a9e7a07645

[START] [2021-01-22 14:35:07] Parse meta.xml file and create formats with fields
[STOP] [2021-01-22 14:35:07] Parse meta.xml file and create formats with fields
[STOP] [2021-01-22 14:35:07] Creating resource from OpenData
[INFO] [2021-01-28 10:15:35] ## HARVEST: type = re_download_opendata_-harvest
[INFO] [2021-01-28 10:15:38] ## remove_type: ScientificName
[INFO] [2021-01-28 10:15:38] ++ Calling delete_all on 0 instances...
[INFO] [2021-01-28 10:15:38] [10:15:38.243] Removed 0 Scientificnames
[INFO] [2021-01-28 10:15:38] ## remove_type: Vernacular
[INFO] [2021-01-28 10:15:38] ++ Calling delete_all on 0 instances...
[INFO] [2021-01-28 10:15:38] [10:15:38.246] Removed 0 Vernaculars
[INFO] [2021-01-28 10:15:38] ## remove_type: Article
[INFO] [2021-01-28 10:15:38] ++ Calling delete_all on 0 instances...
[INFO] [2021-01-28 10:15:38] [10:15:38.249] Removed 0 Articles
[INFO] [2021-01-28 10:15:38] ## remove_type: Medium
[INFO] [2021-01-28 10:15:38] ++ Calling delete_all on 0 instances...
[INFO] [2021-01-28 10:15:38] [10:15:38.253] Removed 0 Media
[INFO] [2021-01-28 10:15:38] ## remove_type: Trait
[INFO] [2021-01-28 10:15:38] ++ Calling delete_all on 0 instances...
[INFO] [2021-01-28 10:15:38] [10:15:38.256] Removed 0 Traits
[INFO] [2021-01-28 10:15:38] ## remove_type: MetaTrait
[INFO] [2021-01-28 10:15:38] ++ Calling delete_all on 0 instances...
[INFO] [2021-01-28 10:15:38] [10:15:38.259] Removed 0 Metatraits
[INFO] [2021-01-28 10:15:38] ## remove_type: OccurrenceMetadatum
[INFO] [2021-01-28 10:15:38] ++ Calling delete_all on 0 instances...
[INFO] [2021-01-28 10:15:38] [10:15:38.262] Removed 0 Occurrencemetadata
[INFO] [2021-01-28 10:15:38] ## remove_type: Assoc
[INFO] [2021-01-28 10:15:38] ++ Calling delete_all on 0 instances...
[INFO] [2021-01-28 10:15:38] [10:15:38.265] Removed 0 Assocs
[INFO] [2021-01-28 10:15:38] ## remove_type: MetaAssoc
[INFO] [2021-01-28 10:15:38] ++ Calling delete_all on 0 instances...
[INFO] [2021-01-28 10:15:38] [10:15:38.268] Removed 0 Metaassocs
[INFO] [2021-01-28 10:15:38] ## remove_type: Identifier
[INFO] [2021-01-28 10:15:38] ++ Calling delete_all on 0 instances...
[INFO] [2021-01-28 10:15:38] [10:15:38.271] Removed 0 Identifiers
[INFO] [2021-01-28 10:15:38] ## remove_type: Reference
[INFO] [2021-01-28 10:15:38] ++ Calling delete_all on 0 instances...
[INFO] [2021-01-28 10:15:38] [10:15:38.273] Removed 0 References
[INFO] [2021-01-28 10:15:38] ## remove_type: Node
[INFO] [2021-01-28 10:15:38] ++ Calling delete_all on 0 instances...
[INFO] [2021-01-28 10:15:38] [10:15:38.294] Removed 0 Nodes
[START] [2021-01-28 10:15:38] logged process: e38687fde360d9ce2f48dd55c3bf56ebb9d8efa6

[START] [2021-01-28 10:15:38] Creating resource from OpenData
[START] [2021-01-28 10:15:38] logged process: e38687fde360d9ce2f48dd55c3bf56ebb9d8efa6

[START] [2021-01-28 10:15:38] Parse meta.xml file and create formats with fields
[WARN] [2021-01-28 10:15:38] (common) IGNORED  (refs) field header: publicationType term: http://eol.org/schema/reference/publicationType
[WARN] [2021-01-28 10:15:38] (common) IGNORED  (nodes) field header: scientificNameAuthorship term: http://rs.tdwg.org/dwc/terms/scientificNameAuthorship
[WARN] [2021-01-28 10:15:38] (common) IGNORED  (nodes) field header: nomenclaturalCode term: http://rs.tdwg.org/dwc/terms/nomenclaturalCode
[WARN] [2021-01-28 10:15:38] (common) IGNORED  (nodes) field header: nomenclaturalStatus term: http://rs.tdwg.org/dwc/terms/nomenclaturalStatus
[STOP] [2021-01-28 10:15:38] Parse meta.xml file and create formats with fields
[STOP] [2021-01-28 10:15:38] Creating resource from OpenData
[START] [2021-01-28 10:15:38] logged process: e38687fde360d9ce2f48dd55c3bf56ebb9d8efa6

[START] [2021-01-28 10:15:38] create_harvest_instance
[STOP] [2021-01-28 10:15:42] create_harvest_instance
[START] [2021-01-28 10:15:42] fetch_files
[STOP] [2021-01-28 10:15:42] fetch_files
[START] [2021-01-28 10:15:42] validate_each_file
[STOP] [2021-01-28 10:15:44] validate_each_file
[START] [2021-01-28 10:15:44] convert_to_csv
[CMD] [2021-01-28 10:15:44] /usr/bin/sort /app/public/converted_csv/deltocephalinae_agents_26239.csv > /app/public/converted_csv/deltocephalinae_agents_26239.csv_sorted
[CMD] [2021-01-28 10:15:44] /usr/bin/sort /app/public/converted_csv/deltocephalinae_refs_26240.csv > /app/public/converted_csv/deltocephalinae_refs_26240.csv_sorted
[CMD] [2021-01-28 10:15:44] /usr/bin/sort /app/public/converted_csv/deltocephalinae_nodes_26241.csv > /app/public/converted_csv/deltocephalinae_nodes_26241.csv_sorted
[CMD] [2021-01-28 10:15:44] /usr/bin/sort /app/public/converted_csv/deltocephalinae_media_26242.csv > /app/public/converted_csv/deltocephalinae_media_26242.csv_sorted
[CMD] [2021-01-28 10:15:44] /usr/bin/sort /app/public/converted_csv/deltocephalinae_vernaculars_26243.csv > /app/public/converted_csv/deltocephalinae_vernaculars_26243.csv_sorted
[CMD] [2021-01-28 10:15:44] /usr/bin/sort /app/public/converted_csv/deltocephalinae_occurrences_26244.csv > /app/public/converted_csv/deltocephalinae_occurrences_26244.csv_sorted
[CMD] [2021-01-28 10:15:44] /usr/bin/sort /app/public/converted_csv/deltocephalinae_measurements_26245.csv > /app/public/converted_csv/deltocephalinae_measurements_26245.csv_sorted
[STOP] [2021-01-28 10:15:44] convert_to_csv
[START] [2021-01-28 10:15:44] calculate_delta
[CMD] [2021-01-28 10:15:44] echo "0a" > /app/public/diff/deltocephalinae_agents_26239.diff
[CMD] [2021-01-28 10:15:44] tail -n +1 /app/public/converted_csv/deltocephalinae_agents_26239.csv >> /app/public/diff/deltocephalinae_agents_26239.diff
[CMD] [2021-01-28 10:15:44] echo "." >> /app/public/diff/deltocephalinae_agents_26239.diff
[CMD] [2021-01-28 10:15:44] echo "0a" > /app/public/diff/deltocephalinae_refs_26240.diff
[CMD] [2021-01-28 10:15:44] tail -n +1 /app/public/converted_csv/deltocephalinae_refs_26240.csv >> /app/public/diff/deltocephalinae_refs_26240.diff
[CMD] [2021-01-28 10:15:44] echo "." >> /app/public/diff/deltocephalinae_refs_26240.diff
[CMD] [2021-01-28 10:15:44] echo "0a" > /app/public/diff/deltocephalinae_nodes_26241.diff
[CMD] [2021-01-28 10:15:44] tail -n +1 /app/public/converted_csv/deltocephalinae_nodes_26241.csv >> /app/public/diff/deltocephalinae_nodes_26241.diff
[CMD] [2021-01-28 10:15:44] echo "." >> /app/public/diff/deltocephalinae_nodes_26241.diff
[CMD] [2021-01-28 10:15:44] echo "0a" > /app/public/diff/deltocephalinae_media_26242.diff
[CMD] [2021-01-28 10:15:44] tail -n +1 /app/public/converted_csv/deltocephalinae_media_26242.csv >> /app/public/diff/deltocephalinae_media_26242.diff
[CMD] [2021-01-28 10:15:44] echo "." >> /app/public/diff/deltocephalinae_media_26242.diff
[CMD] [2021-01-28 10:15:44] echo "0a" > /app/public/diff/deltocephalinae_vernaculars_26243.diff
[CMD] [2021-01-28 10:15:44] tail -n +1 /app/public/converted_csv/deltocephalinae_vernaculars_26243.csv >> /app/public/diff/deltocephalinae_vernaculars_26243.diff
[CMD] [2021-01-28 10:15:44] echo "." >> /app/public/diff/deltocephalinae_vernaculars_26243.diff
[CMD] [2021-01-28 10:15:44] echo "0a" > /app/public/diff/deltocephalinae_occurrences_26244.diff
[CMD] [2021-01-28 10:15:44] tail -n +1 /app/public/converted_csv/deltocephalinae_occurrences_26244.csv >> /app/public/diff/deltocephalinae_occurrences_26244.diff
[CMD] [2021-01-28 10:15:44] echo "." >> /app/public/diff/deltocephalinae_occurrences_26244.diff
[CMD] [2021-01-28 10:15:44] echo "0a" > /app/public/diff/deltocephalinae_measurements_26245.diff
[CMD] [2021-01-28 10:15:44] tail -n +1 /app/public/converted_csv/deltocephalinae_measurements_26245.csv >> /app/public/diff/deltocephalinae_measurements_26245.diff
[CMD] [2021-01-28 10:15:44] echo "." >> /app/public/diff/deltocephalinae_measurements_26245.diff
[STOP] [2021-01-28 10:15:44] calculate_delta
[START] [2021-01-28 10:15:44] parse_diff_and_store
[INFO] [2021-01-28 10:15:44] Loading agents diff file into memory (true lines)...
[INFO] [2021-01-28 10:15:45] Loading refs diff file into memory (true lines)...
[INFO] [2021-01-28 10:15:45] Loading nodes diff file into memory (true lines)...
[WARN] [2021-01-28 10:15:45] Filtered Scientific Name `Macrosteles (brevis [NEEDS A NOM NOV!!!!///])` to `Macrosteles (brevis [NEEDS A NOM NOV!!!!])`
[WARN] [2021-01-28 10:15:45] Filtered Scientific Name `Thamnotettix biguttatus  domino` to `Thamnotettix biguttatus domino`
[WARN] [2021-01-28 10:15:46] Filtered Scientific Name `Gillettiella labiata  var. gillettei` to `Gillettiella labiata var. gillettei`
[INFO] [2021-01-28 10:15:49] Loading media diff file into memory (true lines)...
[INFO] [2021-01-28 10:15:51] Loading vernaculars diff file into memory (true lines)...
[INFO] [2021-01-28 10:15:51] Loading occurrences diff file into memory (true lines)...
[INFO] [2021-01-28 10:15:54] Loading measurements diff file into memory (true lines)...
[INFO] [2021-01-28 10:16:00] Storing 464 Attributions
[INFO] [2021-01-28 10:16:00] Processing group of 464 in 1 groups of 1000
[INFO] [2021-01-28 10:16:00] Average Time: 0.08
[INFO] [2021-01-28 10:16:00] Total Time: 1s
[INFO] [2021-01-28 10:16:00] Storing 904 References
[INFO] [2021-01-28 10:16:00] Processing group of 904 in 1 groups of 1000
[INFO] [2021-01-28 10:16:01] Average Time: 0.18
[INFO] [2021-01-28 10:16:01] Total Time: 1s
[INFO] [2021-01-28 10:16:01] Storing 16779 ScientificNames
[INFO] [2021-01-28 10:16:01] Processing group of 16779 in 17 groups of 1000
[INFO] [2021-01-28 10:16:06] Average Time: 0.313
[INFO] [2021-01-28 10:16:06] Total Time: 6s
[INFO] [2021-01-28 10:16:06] last 3 / first 3: 0.97
[INFO] [2021-01-28 10:16:06] Std.Dev: 0.0; Max: 0.36
[INFO] [2021-01-28 10:16:06] Storing 8267 Nodes
[INFO] [2021-01-28 10:16:06] Processing group of 8267 in 9 groups of 1000
[INFO] [2021-01-28 10:16:09] Average Time: 0.274
[INFO] [2021-01-28 10:16:09] Total Time: 3s
[INFO] [2021-01-28 10:16:09] last 3 / first 3: 1.16
[INFO] [2021-01-28 10:16:09] Std.Dev: 0.08366600265340755; Max: 0.46
[INFO] [2021-01-28 10:16:09] Storing 346 NodesReferences
[INFO] [2021-01-28 10:16:09] Processing group of 346 in 1 groups of 1000
[INFO] [2021-01-28 10:16:09] Average Time: 0.03
[INFO] [2021-01-28 10:16:09] Total Time: 1s
[INFO] [2021-01-28 10:16:09] Storing 3798 ContentAttributions
[INFO] [2021-01-28 10:16:09] Processing group of 3798 in 4 groups of 1000
[INFO] [2021-01-28 10:16:09] Average Time: 0.15
[INFO] [2021-01-28 10:16:09] Total Time: 1s
[INFO] [2021-01-28 10:16:09] Storing 1899 Media
[INFO] [2021-01-28 10:16:09] Processing group of 1899 in 2 groups of 1000
[INFO] [2021-01-28 10:16:10] Average Time: 0.47
[INFO] [2021-01-28 10:16:10] Total Time: 1s
[INFO] [2021-01-28 10:16:10] Storing 3 Vernaculars
[INFO] [2021-01-28 10:16:10] Processing group of 3 in 1 groups of 1000
[INFO] [2021-01-28 10:16:10] Average Time: 0.0
[INFO] [2021-01-28 10:16:10] Total Time: 1s
[INFO] [2021-01-28 10:16:10] Storing 11543 Occurrences
[INFO] [2021-01-28 10:16:10] Processing group of 11543 in 12 groups of 1000
[INFO] [2021-01-28 10:16:11] Average Time: 0.1
[INFO] [2021-01-28 10:16:11] Total Time: 2s
[INFO] [2021-01-28 10:16:11] last 3 / first 3: 0.71
[INFO] [2021-01-28 10:16:11] Std.Dev: 0.0; Max: 0.15
[INFO] [2021-01-28 10:16:11] Storing 23086 OccurrenceMetadata
[INFO] [2021-01-28 10:16:11] Processing group of 23086 in 24 groups of 1000
[INFO] [2021-01-28 10:16:14] Average Time: 0.114
[INFO] [2021-01-28 10:16:14] Total Time: 3s
[INFO] [2021-01-28 10:16:14] last 3 / first 3: 0.63
[INFO] [2021-01-28 10:16:14] Std.Dev: 0.03162277660168379; Max: 0.16
[INFO] [2021-01-28 10:16:14] Storing 11757 TraitsReferences
[INFO] [2021-01-28 10:16:14] Processing group of 11757 in 12 groups of 1000
[INFO] [2021-01-28 10:16:15] Average Time: 0.073
[INFO] [2021-01-28 10:16:15] Total Time: 1s
[INFO] [2021-01-28 10:16:15] last 3 / first 3: 0.78
[INFO] [2021-01-28 10:16:15] Std.Dev: 0.0; Max: 0.12
[INFO] [2021-01-28 10:16:15] Storing 11760 Traits
[INFO] [2021-01-28 10:16:15] Processing group of 11760 in 12 groups of 1000
[INFO] [2021-01-28 10:16:19] Average Time: 0.279
[INFO] [2021-01-28 10:16:19] Total Time: 4s
[INFO] [2021-01-28 10:16:19] last 3 / first 3: 0.81
[INFO] [2021-01-28 10:16:19] Std.Dev: 0.03162277660168379; Max: 0.36
[STOP] [2021-01-28 10:16:19] parse_diff_and_store
[START] [2021-01-28 10:16:19] resolve_keys
[INFO] [2021-01-28 10:16:46] Occurrences to nodes (through scientific_names)...
[INFO] [2021-01-28 10:16:50] traits to occurrences...
[INFO] [2021-01-28 10:16:51] traits to nodes (through occurrences)...
[INFO] [2021-01-28 10:16:51] Traits to sex term...
[INFO] [2021-01-28 10:16:51] Traits to lifestage term...
[INFO] [2021-01-28 10:16:51] MetaTraits to traits...
[INFO] [2021-01-28 10:16:51] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2021-01-28 10:16:52] Assocs to occurrences...
[INFO] [2021-01-28 10:16:52] Assocs to nodes...
[INFO] [2021-01-28 10:16:52] Assoc to sex term...
[INFO] [2021-01-28 10:16:52] Assoc to lifestage term...
[INFO] [2021-01-28 10:16:52] MetaAssoc to assocs...
[STOP] [2021-01-28 10:16:52] resolve_keys
[START] [2021-01-28 10:16:52] hold_for_later_1
[STOP] [2021-01-28 10:16:52] hold_for_later_1
[START] [2021-01-28 10:16:52] hold_for_later_2
[STOP] [2021-01-28 10:16:52] hold_for_later_2
[START] [2021-01-28 10:16:52] resolve_missing_parents
[STOP] [2021-01-28 10:16:52] resolve_missing_parents
[START] [2021-01-28 10:16:52] rebuild_nodes
[START] [2021-01-28 10:16:52] Flattener#flatten
[START] [2021-01-28 10:16:52] Flattener#study_resource
[START] [2021-01-28 10:16:52] Flattener#build_ancestry
[STOP] [2021-01-28 10:16:53] Flattener#build_ancestry
[INFO] [2021-01-28 10:16:53] 8267 ancestry keys
[START] [2021-01-28 10:16:53] build_node_ancestors
[INFO] [2021-01-28 10:16:53] old ancestors deleted.
[STOP] [2021-01-28 10:16:53] build_node_ancestors
[WARN] [2021-01-28 10:16:53] Flattener: nothing to flatten! (Completely flat resource?)
[STOP] [2021-01-28 10:16:53] Flattener#flatten
[STOP] [2021-01-28 10:16:53] rebuild_nodes
[START] [2021-01-28 10:16:53] resolve_missing_media_owners
[STOP] [2021-01-28 10:16:53] resolve_missing_media_owners
[START] [2021-01-28 10:16:53] sanitize_media_verbatims
[STOP] [2021-01-28 10:16:53] sanitize_media_verbatims
[START] [2021-01-28 10:16:53] queue_downloads
[STOP] [2021-01-28 10:16:53] queue_downloads
[START] [2021-01-28 10:16:53] parse_names
[WARN] [2021-01-28 10:16:53] I see 16779 names which still need to be parsed.
[WARN] [2021-01-28 10:17:08] I see 561 names which still need to be parsed.
[WARN] [2021-01-28 10:17:10] I see 67 names which still need to be parsed.
[WARN] [2021-01-28 10:17:11] I see 34 names which still need to be parsed.
[WARN] [2021-01-28 10:17:12] I see 18 names which still need to be parsed.
[WARN] [2021-01-28 10:17:13] I see 10 names which still need to be parsed.
[WARN] [2021-01-28 10:17:14] I see 5 names which still need to be parsed.
[WARN] [2021-01-28 10:17:15] I see 3 names which still need to be parsed.
[WARN] [2021-01-28 10:17:16] I see 1 names which still need to be parsed.
[STOP] [2021-01-28 10:17:18] parse_names
[START] [2021-01-28 10:17:18] denormalize_canonical_names_to_nodes
[STOP] [2021-01-28 10:17:18] denormalize_canonical_names_to_nodes
[START] [2021-01-28 10:17:18] match_nodes
[START] [2021-01-28 10:17:18] map_all_nodes_to_pages
[STOP] [2021-01-28 10:28:40] map_all_nodes_to_pages
[INFO] [2021-01-28 10:28:40] 1718 Unmatched nodes (of 8267)! That's too many to output. Full list in /app/public/data/deltocephalinae/unmatched_nodes.txt ; First 10: Acinopterini (#87640624); Grypotina (#87640625); Euscelidius zetterstedti (#87640628); Neurotettix robustus (#87640632); Phlepsius similis (#87640634); Phlepsius quadripunctatus (#87640635); Phlepsius pulcher (#87640636); Phlepsius imparis (#87640637); Phlepsius desertorum (#87640638); Phlepsius pallidiventris (#87640639)
[START] [2021-01-28 10:28:40] update_nodes
[STOP] [2021-01-28 10:28:44] update_nodes
[STOP] [2021-01-28 10:28:44] match_nodes
[START] [2021-01-28 10:28:44] reindex_search
[STOP] [2021-01-28 10:28:56] reindex_search
[START] [2021-01-28 10:28:56] normalize_units
[STOP] [2021-01-28 10:28:56] normalize_units
[START] [2021-01-28 10:28:56] calculate_statistics
[2021-01-28 10:28:57] ZERO NODE ANCESTORS. Is this actually a completely flat resource?
[STOP] [2021-01-28 10:28:57] calculate_statistics
[START] [2021-01-28 10:28:57] complete_harvest_instance
[START] [2021-01-28 10:28:57] overall_tsv_creation
[INFO] [2021-01-28 10:28:57] Processing group of 8267 in 1 batches of 10000
[INFO] [2021-01-28 10:30:13] 11760 Traits (unfiltered)...
[INFO] [2021-01-28 10:31:35] 11760 Traits (filtered)...
[INFO] [2021-01-28 10:31:35] 0 Associations (filtered)...
[INFO] [2021-01-28 10:31:38] 35277 metadata added.
[INFO] [2021-01-28 10:31:38] 0 metadata added.
[INFO] [2021-01-28 10:32:11] Average Time: 160.13
[INFO] [2021-01-28 10:32:11] Total Time: 3m14s
[STOP] [2021-01-28 10:32:11] overall_tsv_creation
[INFO] [2021-01-28 10:32:11] Done. Check your files:
[INFO] [2021-01-28 10:32:11] (8267 lines) /app/public/data/deltocephalinae/publish_nodes.tsv
[INFO] [2021-01-28 10:32:11] (16779 lines) /app/public/data/deltocephalinae/publish_scientific_names.tsv
[INFO] [2021-01-28 10:32:11] (1899 lines) /app/public/data/deltocephalinae/publish_media.tsv
[INFO] [2021-01-28 10:32:11] (1899 lines) /app/public/data/deltocephalinae/publish_image_info.tsv
[INFO] [2021-01-28 10:32:11] (3 lines) /app/public/data/deltocephalinae/publish_vernaculars.tsv
[INFO] [2021-01-28 10:32:11] (346 lines) /app/public/data/deltocephalinae/publish_references.tsv
[INFO] [2021-01-28 10:32:11] (3798 lines) /app/public/data/deltocephalinae/publish_attributions.tsv
[INFO] [2021-01-28 10:32:11] (346 lines) /app/public/data/deltocephalinae/publish_referents.tsv
[INFO] [2021-01-28 10:32:11] (11761 lines) /app/public/data/deltocephalinae/publish_traits.tsv
[INFO] [2021-01-28 10:32:11] (35278 lines) /app/public/data/deltocephalinae/publish_metadata.tsv
[STOP] [2021-01-28 10:32:11] complete_harvest_instance
[START] [2021-01-28 10:32:11] completed
[STOP] [2021-01-28 10:32:11] completed
[STOP] [2021-01-28 10:32:11] logged process, took 992.71
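
To see where the harvest time went, the [START]/[STOP] pairs in the log above can be paired up and subtracted. The following is a minimal sketch, not part of the harvester: the function name `step_durations` and the regular expression are assumptions made for illustration, and the sample lines are copied from this log. It relies only on the line shape visible here, `[START] [timestamp] step name`. The outer "logged process" entries are skipped because their START and STOP lines carry different text.

```python
import re
from datetime import datetime

# Assumed line shape, taken from the log above: "[START] [YYYY-MM-DD HH:MM:SS] step name"
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)

def step_durations(lines):
    """Pair each [START] with the next [STOP] of the same name.

    Returns a list of (step name, elapsed seconds) in the order the steps stopped.
    Lines that are not START/STOP, and STOPs without a matching START, are ignored.
    """
    open_steps = {}  # step name -> stack of start times (allows repeated steps)
    durations = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        kind, stamp, name = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            open_steps.setdefault(name, []).append(when)
        elif open_steps.get(name):
            started = open_steps[name].pop()
            durations.append((name, (when - started).total_seconds()))
    return durations

# Sample lines copied from the log above.
sample = [
    "[START] [2021-01-28 10:17:18] map_all_nodes_to_pages",
    "[STOP] [2021-01-28 10:28:40] map_all_nodes_to_pages",
    "[START] [2021-01-28 10:28:44] reindex_search",
    "[STOP] [2021-01-28 10:28:56] reindex_search",
    "[START] [2021-01-28 10:28:57] overall_tsv_creation",
    "[STOP] [2021-01-28 10:32:11] overall_tsv_creation",
]

# Print the slowest steps first.
for name, seconds in sorted(step_durations(sample), key=lambda d: -d[1]):
    print(f"{name}: {seconds:.0f}s")
```

On these sample lines, map_all_nodes_to_pages accounts for 682 seconds and overall_tsv_creation for 194 seconds (the 3m14s the log itself reports), so those two steps make up most of the 992.71-second total.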
