Harvest for ECOCROP Created 29 Jun 17:06

Stage: completed
Fetched: 29 Jun 17:06
Validated: 29 Jun 17:06
Deltas Created 29 Jun 17:06
Units Normalized: 29 Jun 17:15
Ancestry Built: 29 Jun 17:10
Nodes Matched: 29 Jun 17:14
Names Parsed: 29 Jun 17:10
New Models Stored: 29 Jun 17:10
Indexed: 29 Jun 17:14
Completed: 29 Jun 17:17
Time to Harvest: less than a minute

Expected File Format Definitions

Harvesting Log (most recent first)

# Logfile created on 2020-06-22 16:50:34 -0400 by logger.rb/v1.4.2
[INFO] [2020-06-22 16:50:34] ## HARVEST: type = -harvest
[START] [2020-06-22 16:50:34] logged process
[START] [2020-06-22 16:50:34] create_harvest_instance
[STOP] [2020-06-22 16:50:34] create_harvest_instance
[START] [2020-06-22 16:50:34] fetch_files
[STOP] [2020-06-22 16:50:34] fetch_files
[START] [2020-06-22 16:50:34] validate_each_file
[STOP] [2020-06-22 16:50:34] validate_each_file
[ERR] [2020-06-22 16:50:34] Exceptions::ColumnUnmatched
[ERR] [2020-06-22 16:50:34] TOO MANY COLUMNS: nodes: taxonID,scientificName,family,kingdom
[ERR] [2020-06-22 16:50:34] ../models/resource_harvester.rb:138:in `block in validate_each_file'
[ERR] [2020-06-22 16:50:34] ../models/resource_harvester.rb:666:in `block in each_format'
[ERR] [2020-06-22 16:50:34] ../models/resource_harvester.rb:651:in `each_format'
[ERR] [2020-06-22 16:50:34] ../models/resource_harvester.rb:132:in `validate_each_file'
[ERR] [2020-06-22 16:50:34] ../models/resource_harvester.rb:86:in `block (3 levels) in start'
[ERR] [2020-06-22 16:50:34] ../models/logged_process.rb:19:in `run_step'
[ERR] [2020-06-22 16:50:34] ../models/resource_harvester.rb:86:in `block (2 levels) in start'
[ERR] [2020-06-22 16:50:34] ../models/resource_harvester.rb:75:in `each_key'
[ERR] [2020-06-22 16:50:34] ../models/resource_harvester.rb:75:in `block in start'
[ERR] [2020-06-22 16:50:34] ../models/resource.rb:151:in `lock'
[ERR] [2020-06-22 16:50:34] ../models/resource_harvester.rb:72:in `start'
[ERR] [2020-06-22 16:50:34] ../models/resource.rb:232:in `harvest'
[ERR] [2020-06-22 16:50:34] bin/rails:4:in `require'
[ERR] [2020-06-22 16:50:34] bin/rails:4:in `<main>'
[STOP] [2020-06-22 16:50:34] logged process, took 0.15
[INFO] [2020-06-22 16:53:03] ## HARVEST: type = re_download_opendata_-harvest
[INFO] [2020-06-22 16:53:08] ## remove_type: ScientificName
[INFO] [2020-06-22 16:53:08] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-22 16:53:08] [16:53:08.171] Removed 0 Scientificnames
[INFO] [2020-06-22 16:53:08] ## remove_type: Vernacular
[INFO] [2020-06-22 16:53:08] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-22 16:53:08] [16:53:08.174] Removed 0 Vernaculars
[INFO] [2020-06-22 16:53:08] ## remove_type: Article
[INFO] [2020-06-22 16:53:08] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-22 16:53:08] [16:53:08.177] Removed 0 Articles
[INFO] [2020-06-22 16:53:08] ## remove_type: Medium
[INFO] [2020-06-22 16:53:08] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-22 16:53:08] [16:53:08.181] Removed 0 Media
[INFO] [2020-06-22 16:53:08] ## remove_type: Trait
[INFO] [2020-06-22 16:53:08] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-22 16:53:08] [16:53:08.184] Removed 0 Traits
[INFO] [2020-06-22 16:53:08] ## remove_type: MetaTrait
[INFO] [2020-06-22 16:53:08] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-22 16:53:08] [16:53:08.187] Removed 0 Metatraits
[INFO] [2020-06-22 16:53:08] ## remove_type: OccurrenceMetadatum
[INFO] [2020-06-22 16:53:08] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-22 16:53:08] [16:53:08.190] Removed 0 Occurrencemetadata
[INFO] [2020-06-22 16:53:08] ## remove_type: Assoc
[INFO] [2020-06-22 16:53:08] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-22 16:53:08] [16:53:08.193] Removed 0 Assocs
[INFO] [2020-06-22 16:53:08] ## remove_type: MetaAssoc
[INFO] [2020-06-22 16:53:08] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-22 16:53:08] [16:53:08.196] Removed 0 Metaassocs
[INFO] [2020-06-22 16:53:08] ## remove_type: Identifier
[INFO] [2020-06-22 16:53:08] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-22 16:53:08] [16:53:08.198] Removed 0 Identifiers
[INFO] [2020-06-22 16:53:08] ## remove_type: Reference
[INFO] [2020-06-22 16:53:08] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-22 16:53:08] [16:53:08.201] Removed 0 References
[INFO] [2020-06-22 16:53:08] ## remove_type: Node
[INFO] [2020-06-22 16:53:08] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-22 16:53:08] [16:53:08.231] Removed 0 Nodes
[START] [2020-06-22 16:53:08] logged process
[START] [2020-06-22 16:53:08] Creating resource from OpenData
[START] [2020-06-22 16:53:09] logged process
[START] [2020-06-22 16:53:09] Parse meta.xml file and create formats with fields
[STOP] [2020-06-22 16:53:09] Parse meta.xml file and create formats with fields
[STOP] [2020-06-22 16:53:09] Creating resource from OpenData
[START] [2020-06-22 16:53:09] logged process
[START] [2020-06-22 16:53:09] create_harvest_instance
[STOP] [2020-06-22 16:53:10] create_harvest_instance
[START] [2020-06-22 16:53:10] fetch_files
[STOP] [2020-06-22 16:53:10] fetch_files
[START] [2020-06-22 16:53:10] validate_each_file
[STOP] [2020-06-22 16:53:13] validate_each_file
[START] [2020-06-22 16:53:13] convert_to_csv
[CMD] [2020-06-22 16:53:13] /usr/bin/sort /app/public/converted_csv/ECOCROP_nodes_21408.csv > /app/public/converted_csv/ECOCROP_nodes_21408.csv_sorted
[CMD] [2020-06-22 16:53:13] /usr/bin/sort /app/public/converted_csv/ECOCROP_occurrences_21409.csv > /app/public/converted_csv/ECOCROP_occurrences_21409.csv_sorted
[CMD] [2020-06-22 16:53:13] /usr/bin/sort /app/public/converted_csv/ECOCROP_measurements_21410.csv > /app/public/converted_csv/ECOCROP_measurements_21410.csv_sorted
[STOP] [2020-06-22 16:53:13] convert_to_csv
[START] [2020-06-22 16:53:13] calculate_delta
[CMD] [2020-06-22 16:53:13] echo "0a" > /app/public/diff/ECOCROP_nodes_21408.diff
[CMD] [2020-06-22 16:53:13] tail -n +1 /app/public/converted_csv/ECOCROP_nodes_21408.csv >> /app/public/diff/ECOCROP_nodes_21408.diff
[CMD] [2020-06-22 16:53:13] echo "." >> /app/public/diff/ECOCROP_nodes_21408.diff
[CMD] [2020-06-22 16:53:13] echo "0a" > /app/public/diff/ECOCROP_occurrences_21409.diff
[CMD] [2020-06-22 16:53:13] tail -n +1 /app/public/converted_csv/ECOCROP_occurrences_21409.csv >> /app/public/diff/ECOCROP_occurrences_21409.diff
[CMD] [2020-06-22 16:53:13] echo "." >> /app/public/diff/ECOCROP_occurrences_21409.diff
[CMD] [2020-06-22 16:53:13] echo "0a" > /app/public/diff/ECOCROP_measurements_21410.diff
[CMD] [2020-06-22 16:53:13] tail -n +1 /app/public/converted_csv/ECOCROP_measurements_21410.csv >> /app/public/diff/ECOCROP_measurements_21410.diff
[CMD] [2020-06-22 16:53:13] echo "." >> /app/public/diff/ECOCROP_measurements_21410.diff
[STOP] [2020-06-22 16:53:13] calculate_delta
[START] [2020-06-22 16:53:13] parse_diff_and_store
[INFO] [2020-06-22 16:53:13] Loading nodes diff file into memory (true lines)...
[WARN] [2020-06-22 16:53:13] Filtered Scientific Name `Setaria sphacelata  var. sericea` to `Setaria sphacelata var. sericea`
[INFO] [2020-06-22 16:53:14] Loading occurrences diff file into memory (true lines)...
[INFO] [2020-06-22 16:53:14] Loading measurements diff file into memory (true lines)...
[WARN] [2020-06-22 16:53:14] Created value term for http://purl.obolibrary.org/obo/UBERON_0036016!
[INFO] [2020-06-22 16:57:21] Storing 2770 ScientificNames
[INFO] [2020-06-22 16:57:21] Processing group of 2770 in 3 groups of 1000
[INFO] [2020-06-22 16:57:22] Average Time: 0.293
[INFO] [2020-06-22 16:57:22] Total Time: 1s
[INFO] [2020-06-22 16:57:22] Storing 2770 Nodes
[INFO] [2020-06-22 16:57:22] Processing group of 2770 in 3 groups of 1000
[INFO] [2020-06-22 16:57:23] Average Time: 0.373
[INFO] [2020-06-22 16:57:23] Total Time: 2s
[INFO] [2020-06-22 16:57:23] Storing 2567 Occurrences
[INFO] [2020-06-22 16:57:23] Processing group of 2567 in 3 groups of 1000
[INFO] [2020-06-22 16:57:23] Average Time: 0.107
[INFO] [2020-06-22 16:57:23] Total Time: 1s
[INFO] [2020-06-22 16:57:23] Storing 55668 Traits
[INFO] [2020-06-22 16:57:23] Processing group of 55668 in 56 groups of 1000
[INFO] [2020-06-22 16:57:41] Average Time: 0.308
[INFO] [2020-06-22 16:57:41] Total Time: 18s
[INFO] [2020-06-22 16:57:41] last 3 / first 3: 0.66
[INFO] [2020-06-22 16:57:41] Std.Dev: 0.14491376746189438; Max: 1.2
[INFO] [2020-06-22 16:57:41] Storing 95893 MetaTraits
[INFO] [2020-06-22 16:57:41] Processing group of 95893 in 96 groups of 1000
[INFO] [2020-06-22 16:57:50] Average Time: 0.094
[INFO] [2020-06-22 16:57:50] Total Time: 10s
[INFO] [2020-06-22 16:57:50] last 3 / first 3: 0.87
[INFO] [2020-06-22 16:57:50] Std.Dev: 0.0; Max: 0.2
[STOP] [2020-06-22 16:57:50] parse_diff_and_store
[START] [2020-06-22 16:57:50] resolve_keys
[INFO] [2020-06-22 16:58:01] Occurrences to nodes (through scientific_names)...
[INFO] [2020-06-22 16:58:02] traits to occurrences...
[INFO] [2020-06-22 16:58:04] traits to nodes (through occurrences)...
[INFO] [2020-06-22 16:58:05] Traits to sex term...
[INFO] [2020-06-22 16:58:05] Traits to lifestage term...
[INFO] [2020-06-22 16:58:05] MetaTraits to traits...
[INFO] [2020-06-22 16:58:11] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2020-06-22 16:58:13] Assocs to occurrences...
[INFO] [2020-06-22 16:58:13] Assocs to nodes...
[INFO] [2020-06-22 16:58:13] Assoc to sex term...
[INFO] [2020-06-22 16:58:13] Assoc to lifestage term...
[STOP] [2020-06-22 16:58:13] resolve_keys
[START] [2020-06-22 16:58:13] hold_for_later_1
[STOP] [2020-06-22 16:58:13] hold_for_later_1
[START] [2020-06-22 16:58:13] hold_for_later_2
[STOP] [2020-06-22 16:58:13] hold_for_later_2
[START] [2020-06-22 16:58:13] resolve_missing_parents
[STOP] [2020-06-22 16:58:13] resolve_missing_parents
[START] [2020-06-22 16:58:13] rebuild_nodes
[START] [2020-06-22 16:58:13] Flattener#flatten
[START] [2020-06-22 16:58:13] Flattener#study_resource
[START] [2020-06-22 16:58:13] Flattener#build_ancestry
[STOP] [2020-06-22 16:58:13] Flattener#build_ancestry
[INFO] [2020-06-22 16:58:13] 2770 ancestry keys
[START] [2020-06-22 16:58:13] build_node_ancestors
[INFO] [2020-06-22 16:58:13] old ancestors deleted.
[STOP] [2020-06-22 16:58:13] build_node_ancestors
[START] [2020-06-22 16:58:14] Flattener#propagate_ancestor_ids
[STOP] [2020-06-22 16:58:14] Flattener#propagate_ancestor_ids
[STOP] [2020-06-22 16:58:14] Flattener#flatten
[STOP] [2020-06-22 16:58:14] rebuild_nodes
[START] [2020-06-22 16:58:14] resolve_missing_media_owners
[STOP] [2020-06-22 16:58:14] resolve_missing_media_owners
[START] [2020-06-22 16:58:14] sanitize_media_verbatims
[STOP] [2020-06-22 16:58:14] sanitize_media_verbatims
[START] [2020-06-22 16:58:14] queue_downloads
[STOP] [2020-06-22 16:58:14] queue_downloads
[START] [2020-06-22 16:58:14] parse_names
[WARN] [2020-06-22 16:58:14] I see 2770 names which still need to be parsed.
[STOP] [2020-06-22 16:58:17] parse_names
[START] [2020-06-22 16:58:17] denormalize_canonical_names_to_nodes
[STOP] [2020-06-22 16:58:17] denormalize_canonical_names_to_nodes
[START] [2020-06-22 16:58:17] match_nodes
[START] [2020-06-22 16:58:17] map_all_nodes_to_pages
[STOP] [2020-06-22 17:01:42] map_all_nodes_to_pages
[INFO] [2020-06-22 17:01:42] 182 Unmatched nodes (of 2770)! That's too many to output. First 10: Stylosanthes guianensis intermedia (#80104951); Trifolium agrarium (#80105019); Trifolium cherlei (#80105024); Acacia anticeps (#80105791); Acacia macdonnelliensis (#80105830); Acacia millifera (#80105837); Acacia parchcarpa (#80105845); Acacia xiphohylla (#80105871); Albizia lucida (#80105953); Albizia stipulata (#80106101)
[START] [2020-06-22 17:01:42] update_nodes
[STOP] [2020-06-22 17:01:43] update_nodes
[STOP] [2020-06-22 17:01:43] match_nodes
[START] [2020-06-22 17:01:43] reindex_search
[STOP] [2020-06-22 17:01:46] reindex_search
[START] [2020-06-22 17:01:46] normalize_units
[STOP] [2020-06-22 17:02:26] normalize_units
[START] [2020-06-22 17:02:26] calculate_statistics
[STOP] [2020-06-22 17:02:26] calculate_statistics
[START] [2020-06-22 17:02:26] complete_harvest_instance
[START] [2020-06-22 17:02:26] overall_tsv_creation
[INFO] [2020-06-22 17:02:26] Processing group of 2770 in 1 batches of 10000
[INFO] [2020-06-22 17:04:34] 35130 Traits (unfiltered)...
[INFO] [2020-06-22 17:04:48] 35130 Traits (filtered)...
[INFO] [2020-06-22 17:04:48] 0 Associations (filtered)...
[INFO] [2020-06-22 17:05:55] 116256 metadata added.
[INFO] [2020-06-22 17:05:55] 0 metadata added.
[INFO] [2020-06-22 17:05:55] Average Time: 111.29
[INFO] [2020-06-22 17:05:55] Total Time: 3m30s
[STOP] [2020-06-22 17:05:55] overall_tsv_creation
[INFO] [2020-06-22 17:05:55] Done. Check your files:
[INFO] [2020-06-22 17:05:55] (2770 lines) /app/public/data/ECOCROP/publish_nodes.tsv
[INFO] [2020-06-22 17:05:55] (5155 lines) /app/public/data/ECOCROP/publish_node_ancestors.tsv
[INFO] [2020-06-22 17:05:55] (2770 lines) /app/public/data/ECOCROP/publish_scientific_names.tsv
[INFO] [2020-06-22 17:05:55] (35131 lines) /app/public/data/ECOCROP/publish_traits.tsv
[INFO] [2020-06-22 17:05:55] (116257 lines) /app/public/data/ECOCROP/publish_metadata.tsv
[STOP] [2020-06-22 17:05:56] complete_harvest_instance
[START] [2020-06-22 17:05:56] completed
[STOP] [2020-06-22 17:05:56] completed
[STOP] [2020-06-22 17:05:56] logged process, took 766.85
[INFO] [2020-06-29 17:05:57] ## HARVEST: type = re_download_opendata_-harvest
[INFO] [2020-06-29 17:05:59] ## remove_type: ScientificName
[INFO] [2020-06-29 17:05:59] ++ Calling delete_all on 2770 instances...
[INFO] [2020-06-29 17:06:00] [17:06:00.209] Removed 2770 Scientificnames
[INFO] [2020-06-29 17:06:00] ## remove_type: Vernacular
[INFO] [2020-06-29 17:06:00] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-29 17:06:00] [17:06:00.212] Removed 0 Vernaculars
[INFO] [2020-06-29 17:06:00] ## remove_type: Article
[INFO] [2020-06-29 17:06:00] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-29 17:06:00] [17:06:00.215] Removed 0 Articles
[INFO] [2020-06-29 17:06:00] ## remove_type: Medium
[INFO] [2020-06-29 17:06:00] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-29 17:06:00] [17:06:00.218] Removed 0 Media
[INFO] [2020-06-29 17:06:00] ## remove_type: Trait
[INFO] [2020-06-29 17:06:00] ++ Calling delete_all on 55668 instances...
[INFO] [2020-06-29 17:06:09] [17:06:09.934] Removed 55668 Traits
[INFO] [2020-06-29 17:06:09] ## remove_type: MetaTrait
[INFO] [2020-06-29 17:06:10] ++ Calling delete_all on 95893 instances...
[INFO] [2020-06-29 17:06:14] [17:06:14.532] Removed 95893 Metatraits
[INFO] [2020-06-29 17:06:14] ## remove_type: OccurrenceMetadatum
[INFO] [2020-06-29 17:06:14] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-29 17:06:14] [17:06:14.535] Removed 0 Occurrencemetadata
[INFO] [2020-06-29 17:06:14] ## remove_type: Assoc
[INFO] [2020-06-29 17:06:14] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-29 17:06:14] [17:06:14.538] Removed 0 Assocs
[INFO] [2020-06-29 17:06:14] ## remove_type: MetaAssoc
[INFO] [2020-06-29 17:06:14] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-29 17:06:14] [17:06:14.541] Removed 0 Metaassocs
[INFO] [2020-06-29 17:06:14] ## remove_type: Identifier
[INFO] [2020-06-29 17:06:14] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-29 17:06:14] [17:06:14.543] Removed 0 Identifiers
[INFO] [2020-06-29 17:06:14] ## remove_type: Reference
[INFO] [2020-06-29 17:06:14] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-29 17:06:14] [17:06:14.546] Removed 0 References
[INFO] [2020-06-29 17:06:14] Starting batch with ID 80105282...
[INFO] [2020-06-29 17:06:15] Starting batch with ID 80105282...
[INFO] [2020-06-29 17:06:15] Starting batch with ID 80105282...
[INFO] [2020-06-29 17:06:16] ## remove_type: Node
[INFO] [2020-06-29 17:06:16] ++ Calling delete_all on 2770 instances...
[INFO] [2020-06-29 17:06:16] [17:06:16.746] Removed 2770 Nodes
[START] [2020-06-29 17:06:18] logged process
[START] [2020-06-29 17:06:18] Creating resource from OpenData
[START] [2020-06-29 17:06:19] logged process
[START] [2020-06-29 17:06:19] Parse meta.xml file and create formats with fields
[STOP] [2020-06-29 17:06:19] Parse meta.xml file and create formats with fields
[STOP] [2020-06-29 17:06:19] Creating resource from OpenData
[START] [2020-06-29 17:06:19] logged process
[START] [2020-06-29 17:06:19] create_harvest_instance
[STOP] [2020-06-29 17:06:21] create_harvest_instance
[START] [2020-06-29 17:06:21] fetch_files
[STOP] [2020-06-29 17:06:21] fetch_files
[START] [2020-06-29 17:06:21] validate_each_file
[STOP] [2020-06-29 17:06:23] validate_each_file
[START] [2020-06-29 17:06:23] convert_to_csv
[CMD] [2020-06-29 17:06:23] /usr/bin/sort /app/public/converted_csv/ECOCROP_nodes_21514.csv > /app/public/converted_csv/ECOCROP_nodes_21514.csv_sorted
[CMD] [2020-06-29 17:06:23] /usr/bin/sort /app/public/converted_csv/ECOCROP_occurrences_21515.csv > /app/public/converted_csv/ECOCROP_occurrences_21515.csv_sorted
[CMD] [2020-06-29 17:06:23] /usr/bin/sort /app/public/converted_csv/ECOCROP_measurements_21516.csv > /app/public/converted_csv/ECOCROP_measurements_21516.csv_sorted
[STOP] [2020-06-29 17:06:23] convert_to_csv
[START] [2020-06-29 17:06:23] calculate_delta
[CMD] [2020-06-29 17:06:23] echo "0a" > /app/public/diff/ECOCROP_nodes_21514.diff
[CMD] [2020-06-29 17:06:23] tail -n +1 /app/public/converted_csv/ECOCROP_nodes_21514.csv >> /app/public/diff/ECOCROP_nodes_21514.diff
[CMD] [2020-06-29 17:06:23] echo "." >> /app/public/diff/ECOCROP_nodes_21514.diff
[CMD] [2020-06-29 17:06:23] echo "0a" > /app/public/diff/ECOCROP_occurrences_21515.diff
[CMD] [2020-06-29 17:06:23] tail -n +1 /app/public/converted_csv/ECOCROP_occurrences_21515.csv >> /app/public/diff/ECOCROP_occurrences_21515.diff
[CMD] [2020-06-29 17:06:23] echo "." >> /app/public/diff/ECOCROP_occurrences_21515.diff
[CMD] [2020-06-29 17:06:23] echo "0a" > /app/public/diff/ECOCROP_measurements_21516.diff
[CMD] [2020-06-29 17:06:23] tail -n +1 /app/public/converted_csv/ECOCROP_measurements_21516.csv >> /app/public/diff/ECOCROP_measurements_21516.diff
[CMD] [2020-06-29 17:06:23] echo "." >> /app/public/diff/ECOCROP_measurements_21516.diff
[STOP] [2020-06-29 17:06:23] calculate_delta
[START] [2020-06-29 17:06:23] parse_diff_and_store
[INFO] [2020-06-29 17:06:23] Loading nodes diff file into memory (true lines)...
[WARN] [2020-06-29 17:06:23] Filtered Scientific Name `Setaria sphacelata  var. sericea` to `Setaria sphacelata var. sericea`
[INFO] [2020-06-29 17:06:24] Loading occurrences diff file into memory (true lines)...
[INFO] [2020-06-29 17:06:24] Loading measurements diff file into memory (true lines)...
[INFO] [2020-06-29 17:09:57] Storing 2770 ScientificNames
[INFO] [2020-06-29 17:09:57] Processing group of 2770 in 3 groups of 1000
[INFO] [2020-06-29 17:09:58] Average Time: 0.357
[INFO] [2020-06-29 17:09:58] Total Time: 2s
[INFO] [2020-06-29 17:09:58] Storing 2770 Nodes
[INFO] [2020-06-29 17:09:58] Processing group of 2770 in 3 groups of 1000
[INFO] [2020-06-29 17:09:59] Average Time: 0.35
[INFO] [2020-06-29 17:09:59] Total Time: 2s
[INFO] [2020-06-29 17:09:59] Storing 2567 Occurrences
[INFO] [2020-06-29 17:09:59] Processing group of 2567 in 3 groups of 1000
[INFO] [2020-06-29 17:10:00] Average Time: 0.183
[INFO] [2020-06-29 17:10:00] Total Time: 1s
[INFO] [2020-06-29 17:10:00] Storing 46351 Traits
[INFO] [2020-06-29 17:10:00] Processing group of 46351 in 47 groups of 1000
[INFO] [2020-06-29 17:10:14] Average Time: 0.295
[INFO] [2020-06-29 17:10:14] Total Time: 15s
[INFO] [2020-06-29 17:10:14] last 3 / first 3: 0.69
[INFO] [2020-06-29 17:10:14] Std.Dev: 0.15165750888103102; Max: 1.17
[INFO] [2020-06-29 17:10:14] Storing 85746 MetaTraits
[INFO] [2020-06-29 17:10:14] Processing group of 85746 in 86 groups of 1000
[INFO] [2020-06-29 17:10:24] Average Time: 0.113
[INFO] [2020-06-29 17:10:24] Total Time: 10s
[INFO] [2020-06-29 17:10:24] last 3 / first 3: 0.81
[INFO] [2020-06-29 17:10:24] Std.Dev: 0.10488088481701516; Max: 0.95
[STOP] [2020-06-29 17:10:24] parse_diff_and_store
[START] [2020-06-29 17:10:24] resolve_keys
[INFO] [2020-06-29 17:10:34] Occurrences to nodes (through scientific_names)...
[INFO] [2020-06-29 17:10:34] traits to occurrences...
[INFO] [2020-06-29 17:10:36] traits to nodes (through occurrences)...
[INFO] [2020-06-29 17:10:37] Traits to sex term...
[INFO] [2020-06-29 17:10:37] Traits to lifestage term...
[INFO] [2020-06-29 17:10:38] MetaTraits to traits...
[INFO] [2020-06-29 17:10:43] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2020-06-29 17:10:44] Assocs to occurrences...
[INFO] [2020-06-29 17:10:44] Assocs to nodes...
[INFO] [2020-06-29 17:10:44] Assoc to sex term...
[INFO] [2020-06-29 17:10:44] Assoc to lifestage term...
[STOP] [2020-06-29 17:10:44] resolve_keys
[START] [2020-06-29 17:10:44] hold_for_later_1
[STOP] [2020-06-29 17:10:44] hold_for_later_1
[START] [2020-06-29 17:10:44] hold_for_later_2
[STOP] [2020-06-29 17:10:44] hold_for_later_2
[START] [2020-06-29 17:10:44] resolve_missing_parents
[STOP] [2020-06-29 17:10:44] resolve_missing_parents
[START] [2020-06-29 17:10:44] rebuild_nodes
[START] [2020-06-29 17:10:44] Flattener#flatten
[START] [2020-06-29 17:10:44] Flattener#study_resource
[START] [2020-06-29 17:10:44] Flattener#build_ancestry
[STOP] [2020-06-29 17:10:44] Flattener#build_ancestry
[INFO] [2020-06-29 17:10:44] 2770 ancestry keys
[START] [2020-06-29 17:10:44] build_node_ancestors
[INFO] [2020-06-29 17:10:44] old ancestors deleted.
[STOP] [2020-06-29 17:10:45] build_node_ancestors
[START] [2020-06-29 17:10:45] Flattener#propagate_ancestor_ids
[STOP] [2020-06-29 17:10:45] Flattener#propagate_ancestor_ids
[STOP] [2020-06-29 17:10:45] Flattener#flatten
[STOP] [2020-06-29 17:10:45] rebuild_nodes
[START] [2020-06-29 17:10:45] resolve_missing_media_owners
[STOP] [2020-06-29 17:10:45] resolve_missing_media_owners
[START] [2020-06-29 17:10:45] sanitize_media_verbatims
[STOP] [2020-06-29 17:10:45] sanitize_media_verbatims
[START] [2020-06-29 17:10:45] queue_downloads
[STOP] [2020-06-29 17:10:45] queue_downloads
[START] [2020-06-29 17:10:45] parse_names
[WARN] [2020-06-29 17:10:45] I see 2770 names which still need to be parsed.
[STOP] [2020-06-29 17:10:48] parse_names
[START] [2020-06-29 17:10:48] denormalize_canonical_names_to_nodes
[STOP] [2020-06-29 17:10:48] denormalize_canonical_names_to_nodes
[START] [2020-06-29 17:10:48] match_nodes
[START] [2020-06-29 17:10:48] map_all_nodes_to_pages
[STOP] [2020-06-29 17:14:34] map_all_nodes_to_pages
[INFO] [2020-06-29 17:14:34] 182 Unmatched nodes (of 2770)! That's too many to output. First 10: Stylosanthes guianensis intermedia (#80136729); Trifolium agrarium (#80136797); Trifolium cherlei (#80136802); Acacia anticeps (#80137569); Acacia macdonnelliensis (#80137608); Acacia millifera (#80137615); Acacia parchcarpa (#80137623); Acacia xiphohylla (#80137649); Albizia lucida (#80137731); Albizia stipulata (#80137879)
[START] [2020-06-29 17:14:34] update_nodes
[STOP] [2020-06-29 17:14:35] update_nodes
[STOP] [2020-06-29 17:14:35] match_nodes
[START] [2020-06-29 17:14:35] reindex_search
[STOP] [2020-06-29 17:14:38] reindex_search
[START] [2020-06-29 17:14:38] normalize_units
[STOP] [2020-06-29 17:15:19] normalize_units
[START] [2020-06-29 17:15:19] calculate_statistics
[STOP] [2020-06-29 17:15:19] calculate_statistics
[START] [2020-06-29 17:15:19] complete_harvest_instance
[START] [2020-06-29 17:15:19] overall_tsv_creation
[INFO] [2020-06-29 17:15:19] Processing group of 2770 in 1 batches of 10000
[INFO] [2020-06-29 17:16:19] 30492 Traits (unfiltered)...
[INFO] [2020-06-29 17:16:32] 30492 Traits (filtered)...
[INFO] [2020-06-29 17:16:32] 0 Associations (filtered)...
[INFO] [2020-06-29 17:17:35] 101453 metadata added.
[INFO] [2020-06-29 17:17:35] 0 metadata added.
[INFO] [2020-06-29 17:17:35] Average Time: 106.05
[INFO] [2020-06-29 17:17:35] Total Time: 2m17s
[STOP] [2020-06-29 17:17:35] overall_tsv_creation
[INFO] [2020-06-29 17:17:35] Done. Check your files:
[INFO] [2020-06-29 17:17:35] (2770 lines) /app/public/data/ECOCROP/publish_nodes.tsv
[INFO] [2020-06-29 17:17:35] (5155 lines) /app/public/data/ECOCROP/publish_node_ancestors.tsv
[INFO] [2020-06-29 17:17:35] (2770 lines) /app/public/data/ECOCROP/publish_scientific_names.tsv
[INFO] [2020-06-29 17:17:35] (30493 lines) /app/public/data/ECOCROP/publish_traits.tsv
[INFO] [2020-06-29 17:17:35] (101454 lines) /app/public/data/ECOCROP/publish_metadata.tsv
[STOP] [2020-06-29 17:17:35] complete_harvest_instance
[START] [2020-06-29 17:17:35] completed
[STOP] [2020-06-29 17:17:35] completed
[STOP] [2020-06-29 17:17:35] logged process, took 676.51

Latest Process