Harvest for ECOCROP Created 30 Dec 16:29

Stage: completed
Fetched: 30 Dec 16:29
Validated: 30 Dec 16:29
Deltas Created 30 Dec 16:29
Units Normalized: 30 Dec 16:34
Ancestry Built: 30 Dec 16:31
Nodes Matched: 30 Dec 16:33
Names Parsed: 30 Dec 16:31
New Models Stored: 30 Dec 16:30
Indexed: 30 Dec 16:33
Completed: 30 Dec 16:38
Time to Harvest: less than a minute

Expected File Format Definitions

Harvesting Log (most recent first)

# Logfile created on 2020-06-22 16:50:34 -0400 by logger.rb/v1.4.2
[INFO] [2020-06-22 16:50:34] ## HARVEST: type = -harvest
[START] [2020-06-22 16:50:34] logged process
[START] [2020-06-22 16:50:34] create_harvest_instance
[STOP] [2020-06-22 16:50:34] create_harvest_instance
[START] [2020-06-22 16:50:34] fetch_files
[STOP] [2020-06-22 16:50:34] fetch_files
[START] [2020-06-22 16:50:34] validate_each_file
[STOP] [2020-06-22 16:50:34] validate_each_file
[ERR] [2020-06-22 16:50:34] Exceptions::ColumnUnmatched
[ERR] [2020-06-22 16:50:34] TOO MANY COLUMNS: nodes: taxonID,scientificName,family,kingdom
[ERR] [2020-06-22 16:50:34] ../models/resource_harvester.rb:138:in `block in validate_each_file'
[ERR] [2020-06-22 16:50:34] ../models/resource_harvester.rb:666:in `block in each_format'
[ERR] [2020-06-22 16:50:34] ../models/resource_harvester.rb:651:in `each_format'
[ERR] [2020-06-22 16:50:34] ../models/resource_harvester.rb:132:in `validate_each_file'
[ERR] [2020-06-22 16:50:34] ../models/resource_harvester.rb:86:in `block (3 levels) in start'
[ERR] [2020-06-22 16:50:34] ../models/logged_process.rb:19:in `run_step'
[ERR] [2020-06-22 16:50:34] ../models/resource_harvester.rb:86:in `block (2 levels) in start'
[ERR] [2020-06-22 16:50:34] ../models/resource_harvester.rb:75:in `each_key'
[ERR] [2020-06-22 16:50:34] ../models/resource_harvester.rb:75:in `block in start'
[ERR] [2020-06-22 16:50:34] ../models/resource.rb:151:in `lock'
[ERR] [2020-06-22 16:50:34] ../models/resource_harvester.rb:72:in `start'
[ERR] [2020-06-22 16:50:34] ../models/resource.rb:232:in `harvest'
[ERR] [2020-06-22 16:50:34] bin/rails:4:in `require'
[ERR] [2020-06-22 16:50:34] bin/rails:4:in `<main>'
[STOP] [2020-06-22 16:50:34] logged process, took 0.15
[INFO] [2020-06-22 16:53:03] ## HARVEST: type = re_download_opendata_-harvest
[INFO] [2020-06-22 16:53:08] ## remove_type: ScientificName
[INFO] [2020-06-22 16:53:08] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-22 16:53:08] [16:53:08.171] Removed 0 Scientificnames
[INFO] [2020-06-22 16:53:08] ## remove_type: Vernacular
[INFO] [2020-06-22 16:53:08] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-22 16:53:08] [16:53:08.174] Removed 0 Vernaculars
[INFO] [2020-06-22 16:53:08] ## remove_type: Article
[INFO] [2020-06-22 16:53:08] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-22 16:53:08] [16:53:08.177] Removed 0 Articles
[INFO] [2020-06-22 16:53:08] ## remove_type: Medium
[INFO] [2020-06-22 16:53:08] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-22 16:53:08] [16:53:08.181] Removed 0 Media
[INFO] [2020-06-22 16:53:08] ## remove_type: Trait
[INFO] [2020-06-22 16:53:08] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-22 16:53:08] [16:53:08.184] Removed 0 Traits
[INFO] [2020-06-22 16:53:08] ## remove_type: MetaTrait
[INFO] [2020-06-22 16:53:08] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-22 16:53:08] [16:53:08.187] Removed 0 Metatraits
[INFO] [2020-06-22 16:53:08] ## remove_type: OccurrenceMetadatum
[INFO] [2020-06-22 16:53:08] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-22 16:53:08] [16:53:08.190] Removed 0 Occurrencemetadata
[INFO] [2020-06-22 16:53:08] ## remove_type: Assoc
[INFO] [2020-06-22 16:53:08] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-22 16:53:08] [16:53:08.193] Removed 0 Assocs
[INFO] [2020-06-22 16:53:08] ## remove_type: MetaAssoc
[INFO] [2020-06-22 16:53:08] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-22 16:53:08] [16:53:08.196] Removed 0 Metaassocs
[INFO] [2020-06-22 16:53:08] ## remove_type: Identifier
[INFO] [2020-06-22 16:53:08] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-22 16:53:08] [16:53:08.198] Removed 0 Identifiers
[INFO] [2020-06-22 16:53:08] ## remove_type: Reference
[INFO] [2020-06-22 16:53:08] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-22 16:53:08] [16:53:08.201] Removed 0 References
[INFO] [2020-06-22 16:53:08] ## remove_type: Node
[INFO] [2020-06-22 16:53:08] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-22 16:53:08] [16:53:08.231] Removed 0 Nodes
[START] [2020-06-22 16:53:08] logged process
[START] [2020-06-22 16:53:08] Creating resource from OpenData
[START] [2020-06-22 16:53:09] logged process
[START] [2020-06-22 16:53:09] Parse meta.xml file and create formats with fields
[STOP] [2020-06-22 16:53:09] Parse meta.xml file and create formats with fields
[STOP] [2020-06-22 16:53:09] Creating resource from OpenData
[START] [2020-06-22 16:53:09] logged process
[START] [2020-06-22 16:53:09] create_harvest_instance
[STOP] [2020-06-22 16:53:10] create_harvest_instance
[START] [2020-06-22 16:53:10] fetch_files
[STOP] [2020-06-22 16:53:10] fetch_files
[START] [2020-06-22 16:53:10] validate_each_file
[STOP] [2020-06-22 16:53:13] validate_each_file
[START] [2020-06-22 16:53:13] convert_to_csv
[CMD] [2020-06-22 16:53:13] /usr/bin/sort /app/public/converted_csv/ECOCROP_nodes_21408.csv > /app/public/converted_csv/ECOCROP_nodes_21408.csv_sorted
[CMD] [2020-06-22 16:53:13] /usr/bin/sort /app/public/converted_csv/ECOCROP_occurrences_21409.csv > /app/public/converted_csv/ECOCROP_occurrences_21409.csv_sorted
[CMD] [2020-06-22 16:53:13] /usr/bin/sort /app/public/converted_csv/ECOCROP_measurements_21410.csv > /app/public/converted_csv/ECOCROP_measurements_21410.csv_sorted
[STOP] [2020-06-22 16:53:13] convert_to_csv
[START] [2020-06-22 16:53:13] calculate_delta
[CMD] [2020-06-22 16:53:13] echo "0a" > /app/public/diff/ECOCROP_nodes_21408.diff
[CMD] [2020-06-22 16:53:13] tail -n +1 /app/public/converted_csv/ECOCROP_nodes_21408.csv >> /app/public/diff/ECOCROP_nodes_21408.diff
[CMD] [2020-06-22 16:53:13] echo "." >> /app/public/diff/ECOCROP_nodes_21408.diff
[CMD] [2020-06-22 16:53:13] echo "0a" > /app/public/diff/ECOCROP_occurrences_21409.diff
[CMD] [2020-06-22 16:53:13] tail -n +1 /app/public/converted_csv/ECOCROP_occurrences_21409.csv >> /app/public/diff/ECOCROP_occurrences_21409.diff
[CMD] [2020-06-22 16:53:13] echo "." >> /app/public/diff/ECOCROP_occurrences_21409.diff
[CMD] [2020-06-22 16:53:13] echo "0a" > /app/public/diff/ECOCROP_measurements_21410.diff
[CMD] [2020-06-22 16:53:13] tail -n +1 /app/public/converted_csv/ECOCROP_measurements_21410.csv >> /app/public/diff/ECOCROP_measurements_21410.diff
[CMD] [2020-06-22 16:53:13] echo "." >> /app/public/diff/ECOCROP_measurements_21410.diff
[STOP] [2020-06-22 16:53:13] calculate_delta
[START] [2020-06-22 16:53:13] parse_diff_and_store
[INFO] [2020-06-22 16:53:13] Loading nodes diff file into memory (true lines)...
[WARN] [2020-06-22 16:53:13] Filtered Scientific Name `Setaria sphacelata  var. sericea` to `Setaria sphacelata var. sericea`
[INFO] [2020-06-22 16:53:14] Loading occurrences diff file into memory (true lines)...
[INFO] [2020-06-22 16:53:14] Loading measurements diff file into memory (true lines)...
[WARN] [2020-06-22 16:53:14] Created value term for http://purl.obolibrary.org/obo/UBERON_0036016!
[INFO] [2020-06-22 16:57:21] Storing 2770 ScientificNames
[INFO] [2020-06-22 16:57:21] Processing group of 2770 in 3 groups of 1000
[INFO] [2020-06-22 16:57:22] Average Time: 0.293
[INFO] [2020-06-22 16:57:22] Total Time: 1s
[INFO] [2020-06-22 16:57:22] Storing 2770 Nodes
[INFO] [2020-06-22 16:57:22] Processing group of 2770 in 3 groups of 1000
[INFO] [2020-06-22 16:57:23] Average Time: 0.373
[INFO] [2020-06-22 16:57:23] Total Time: 2s
[INFO] [2020-06-22 16:57:23] Storing 2567 Occurrences
[INFO] [2020-06-22 16:57:23] Processing group of 2567 in 3 groups of 1000
[INFO] [2020-06-22 16:57:23] Average Time: 0.107
[INFO] [2020-06-22 16:57:23] Total Time: 1s
[INFO] [2020-06-22 16:57:23] Storing 55668 Traits
[INFO] [2020-06-22 16:57:23] Processing group of 55668 in 56 groups of 1000
[INFO] [2020-06-22 16:57:41] Average Time: 0.308
[INFO] [2020-06-22 16:57:41] Total Time: 18s
[INFO] [2020-06-22 16:57:41] last 3 / first 3: 0.66
[INFO] [2020-06-22 16:57:41] Std.Dev: 0.14491376746189438; Max: 1.2
[INFO] [2020-06-22 16:57:41] Storing 95893 MetaTraits
[INFO] [2020-06-22 16:57:41] Processing group of 95893 in 96 groups of 1000
[INFO] [2020-06-22 16:57:50] Average Time: 0.094
[INFO] [2020-06-22 16:57:50] Total Time: 10s
[INFO] [2020-06-22 16:57:50] last 3 / first 3: 0.87
[INFO] [2020-06-22 16:57:50] Std.Dev: 0.0; Max: 0.2
[STOP] [2020-06-22 16:57:50] parse_diff_and_store
[START] [2020-06-22 16:57:50] resolve_keys
[INFO] [2020-06-22 16:58:01] Occurrences to nodes (through scientific_names)...
[INFO] [2020-06-22 16:58:02] traits to occurrences...
[INFO] [2020-06-22 16:58:04] traits to nodes (through occurrences)...
[INFO] [2020-06-22 16:58:05] Traits to sex term...
[INFO] [2020-06-22 16:58:05] Traits to lifestage term...
[INFO] [2020-06-22 16:58:05] MetaTraits to traits...
[INFO] [2020-06-22 16:58:11] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2020-06-22 16:58:13] Assocs to occurrences...
[INFO] [2020-06-22 16:58:13] Assocs to nodes...
[INFO] [2020-06-22 16:58:13] Assoc to sex term...
[INFO] [2020-06-22 16:58:13] Assoc to lifestage term...
[STOP] [2020-06-22 16:58:13] resolve_keys
[START] [2020-06-22 16:58:13] hold_for_later_1
[STOP] [2020-06-22 16:58:13] hold_for_later_1
[START] [2020-06-22 16:58:13] hold_for_later_2
[STOP] [2020-06-22 16:58:13] hold_for_later_2
[START] [2020-06-22 16:58:13] resolve_missing_parents
[STOP] [2020-06-22 16:58:13] resolve_missing_parents
[START] [2020-06-22 16:58:13] rebuild_nodes
[START] [2020-06-22 16:58:13] Flattener#flatten
[START] [2020-06-22 16:58:13] Flattener#study_resource
[START] [2020-06-22 16:58:13] Flattener#build_ancestry
[STOP] [2020-06-22 16:58:13] Flattener#build_ancestry
[INFO] [2020-06-22 16:58:13] 2770 ancestry keys
[START] [2020-06-22 16:58:13] build_node_ancestors
[INFO] [2020-06-22 16:58:13] old ancestors deleted.
[STOP] [2020-06-22 16:58:13] build_node_ancestors
[START] [2020-06-22 16:58:14] Flattener#propagate_ancestor_ids
[STOP] [2020-06-22 16:58:14] Flattener#propagate_ancestor_ids
[STOP] [2020-06-22 16:58:14] Flattener#flatten
[STOP] [2020-06-22 16:58:14] rebuild_nodes
[START] [2020-06-22 16:58:14] resolve_missing_media_owners
[STOP] [2020-06-22 16:58:14] resolve_missing_media_owners
[START] [2020-06-22 16:58:14] sanitize_media_verbatims
[STOP] [2020-06-22 16:58:14] sanitize_media_verbatims
[START] [2020-06-22 16:58:14] queue_downloads
[STOP] [2020-06-22 16:58:14] queue_downloads
[START] [2020-06-22 16:58:14] parse_names
[WARN] [2020-06-22 16:58:14] I see 2770 names which still need to be parsed.
[STOP] [2020-06-22 16:58:17] parse_names
[START] [2020-06-22 16:58:17] denormalize_canonical_names_to_nodes
[STOP] [2020-06-22 16:58:17] denormalize_canonical_names_to_nodes
[START] [2020-06-22 16:58:17] match_nodes
[START] [2020-06-22 16:58:17] map_all_nodes_to_pages
[STOP] [2020-06-22 17:01:42] map_all_nodes_to_pages
[INFO] [2020-06-22 17:01:42] 182 Unmatched nodes (of 2770)! That's too many to output. First 10: Stylosanthes guianensis intermedia (#80104951); Trifolium agrarium (#80105019); Trifolium cherlei (#80105024); Acacia anticeps (#80105791); Acacia macdonnelliensis (#80105830); Acacia millifera (#80105837); Acacia parchcarpa (#80105845); Acacia xiphohylla (#80105871); Albizia lucida (#80105953); Albizia stipulata (#80106101)
[START] [2020-06-22 17:01:42] update_nodes
[STOP] [2020-06-22 17:01:43] update_nodes
[STOP] [2020-06-22 17:01:43] match_nodes
[START] [2020-06-22 17:01:43] reindex_search
[STOP] [2020-06-22 17:01:46] reindex_search
[START] [2020-06-22 17:01:46] normalize_units
[STOP] [2020-06-22 17:02:26] normalize_units
[START] [2020-06-22 17:02:26] calculate_statistics
[STOP] [2020-06-22 17:02:26] calculate_statistics
[START] [2020-06-22 17:02:26] complete_harvest_instance
[START] [2020-06-22 17:02:26] overall_tsv_creation
[INFO] [2020-06-22 17:02:26] Processing group of 2770 in 1 batches of 10000
[INFO] [2020-06-22 17:04:34] 35130 Traits (unfiltered)...
[INFO] [2020-06-22 17:04:48] 35130 Traits (filtered)...
[INFO] [2020-06-22 17:04:48] 0 Associations (filtered)...
[INFO] [2020-06-22 17:05:55] 116256 metadata added.
[INFO] [2020-06-22 17:05:55] 0 metadata added.
[INFO] [2020-06-22 17:05:55] Average Time: 111.29
[INFO] [2020-06-22 17:05:55] Total Time: 3m30s
[STOP] [2020-06-22 17:05:55] overall_tsv_creation
[INFO] [2020-06-22 17:05:55] Done. Check your files:
[INFO] [2020-06-22 17:05:55] (2770 lines) /app/public/data/ECOCROP/publish_nodes.tsv
[INFO] [2020-06-22 17:05:55] (5155 lines) /app/public/data/ECOCROP/publish_node_ancestors.tsv
[INFO] [2020-06-22 17:05:55] (2770 lines) /app/public/data/ECOCROP/publish_scientific_names.tsv
[INFO] [2020-06-22 17:05:55] (35131 lines) /app/public/data/ECOCROP/publish_traits.tsv
[INFO] [2020-06-22 17:05:55] (116257 lines) /app/public/data/ECOCROP/publish_metadata.tsv
[STOP] [2020-06-22 17:05:56] complete_harvest_instance
[START] [2020-06-22 17:05:56] completed
[STOP] [2020-06-22 17:05:56] completed
[STOP] [2020-06-22 17:05:56] logged process, took 766.85
[INFO] [2020-06-29 17:05:57] ## HARVEST: type = re_download_opendata_-harvest
[INFO] [2020-06-29 17:05:59] ## remove_type: ScientificName
[INFO] [2020-06-29 17:05:59] ++ Calling delete_all on 2770 instances...
[INFO] [2020-06-29 17:06:00] [17:06:00.209] Removed 2770 Scientificnames
[INFO] [2020-06-29 17:06:00] ## remove_type: Vernacular
[INFO] [2020-06-29 17:06:00] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-29 17:06:00] [17:06:00.212] Removed 0 Vernaculars
[INFO] [2020-06-29 17:06:00] ## remove_type: Article
[INFO] [2020-06-29 17:06:00] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-29 17:06:00] [17:06:00.215] Removed 0 Articles
[INFO] [2020-06-29 17:06:00] ## remove_type: Medium
[INFO] [2020-06-29 17:06:00] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-29 17:06:00] [17:06:00.218] Removed 0 Media
[INFO] [2020-06-29 17:06:00] ## remove_type: Trait
[INFO] [2020-06-29 17:06:00] ++ Calling delete_all on 55668 instances...
[INFO] [2020-06-29 17:06:09] [17:06:09.934] Removed 55668 Traits
[INFO] [2020-06-29 17:06:09] ## remove_type: MetaTrait
[INFO] [2020-06-29 17:06:10] ++ Calling delete_all on 95893 instances...
[INFO] [2020-06-29 17:06:14] [17:06:14.532] Removed 95893 Metatraits
[INFO] [2020-06-29 17:06:14] ## remove_type: OccurrenceMetadatum
[INFO] [2020-06-29 17:06:14] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-29 17:06:14] [17:06:14.535] Removed 0 Occurrencemetadata
[INFO] [2020-06-29 17:06:14] ## remove_type: Assoc
[INFO] [2020-06-29 17:06:14] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-29 17:06:14] [17:06:14.538] Removed 0 Assocs
[INFO] [2020-06-29 17:06:14] ## remove_type: MetaAssoc
[INFO] [2020-06-29 17:06:14] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-29 17:06:14] [17:06:14.541] Removed 0 Metaassocs
[INFO] [2020-06-29 17:06:14] ## remove_type: Identifier
[INFO] [2020-06-29 17:06:14] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-29 17:06:14] [17:06:14.543] Removed 0 Identifiers
[INFO] [2020-06-29 17:06:14] ## remove_type: Reference
[INFO] [2020-06-29 17:06:14] ++ Calling delete_all on 0 instances...
[INFO] [2020-06-29 17:06:14] [17:06:14.546] Removed 0 References
[INFO] [2020-06-29 17:06:14] Starting batch with ID 80105282...
[INFO] [2020-06-29 17:06:15] Starting batch with ID 80105282...
[INFO] [2020-06-29 17:06:15] Starting batch with ID 80105282...
[INFO] [2020-06-29 17:06:16] ## remove_type: Node
[INFO] [2020-06-29 17:06:16] ++ Calling delete_all on 2770 instances...
[INFO] [2020-06-29 17:06:16] [17:06:16.746] Removed 2770 Nodes
[START] [2020-06-29 17:06:18] logged process
[START] [2020-06-29 17:06:18] Creating resource from OpenData
[START] [2020-06-29 17:06:19] logged process
[START] [2020-06-29 17:06:19] Parse meta.xml file and create formats with fields
[STOP] [2020-06-29 17:06:19] Parse meta.xml file and create formats with fields
[STOP] [2020-06-29 17:06:19] Creating resource from OpenData
[START] [2020-06-29 17:06:19] logged process
[START] [2020-06-29 17:06:19] create_harvest_instance
[STOP] [2020-06-29 17:06:21] create_harvest_instance
[START] [2020-06-29 17:06:21] fetch_files
[STOP] [2020-06-29 17:06:21] fetch_files
[START] [2020-06-29 17:06:21] validate_each_file
[STOP] [2020-06-29 17:06:23] validate_each_file
[START] [2020-06-29 17:06:23] convert_to_csv
[CMD] [2020-06-29 17:06:23] /usr/bin/sort /app/public/converted_csv/ECOCROP_nodes_21514.csv > /app/public/converted_csv/ECOCROP_nodes_21514.csv_sorted
[CMD] [2020-06-29 17:06:23] /usr/bin/sort /app/public/converted_csv/ECOCROP_occurrences_21515.csv > /app/public/converted_csv/ECOCROP_occurrences_21515.csv_sorted
[CMD] [2020-06-29 17:06:23] /usr/bin/sort /app/public/converted_csv/ECOCROP_measurements_21516.csv > /app/public/converted_csv/ECOCROP_measurements_21516.csv_sorted
[STOP] [2020-06-29 17:06:23] convert_to_csv
[START] [2020-06-29 17:06:23] calculate_delta
[CMD] [2020-06-29 17:06:23] echo "0a" > /app/public/diff/ECOCROP_nodes_21514.diff
[CMD] [2020-06-29 17:06:23] tail -n +1 /app/public/converted_csv/ECOCROP_nodes_21514.csv >> /app/public/diff/ECOCROP_nodes_21514.diff
[CMD] [2020-06-29 17:06:23] echo "." >> /app/public/diff/ECOCROP_nodes_21514.diff
[CMD] [2020-06-29 17:06:23] echo "0a" > /app/public/diff/ECOCROP_occurrences_21515.diff
[CMD] [2020-06-29 17:06:23] tail -n +1 /app/public/converted_csv/ECOCROP_occurrences_21515.csv >> /app/public/diff/ECOCROP_occurrences_21515.diff
[CMD] [2020-06-29 17:06:23] echo "." >> /app/public/diff/ECOCROP_occurrences_21515.diff
[CMD] [2020-06-29 17:06:23] echo "0a" > /app/public/diff/ECOCROP_measurements_21516.diff
[CMD] [2020-06-29 17:06:23] tail -n +1 /app/public/converted_csv/ECOCROP_measurements_21516.csv >> /app/public/diff/ECOCROP_measurements_21516.diff
[CMD] [2020-06-29 17:06:23] echo "." >> /app/public/diff/ECOCROP_measurements_21516.diff
[STOP] [2020-06-29 17:06:23] calculate_delta
[START] [2020-06-29 17:06:23] parse_diff_and_store
[INFO] [2020-06-29 17:06:23] Loading nodes diff file into memory (true lines)...
[WARN] [2020-06-29 17:06:23] Filtered Scientific Name `Setaria sphacelata  var. sericea` to `Setaria sphacelata var. sericea`
[INFO] [2020-06-29 17:06:24] Loading occurrences diff file into memory (true lines)...
[INFO] [2020-06-29 17:06:24] Loading measurements diff file into memory (true lines)...
[INFO] [2020-06-29 17:09:57] Storing 2770 ScientificNames
[INFO] [2020-06-29 17:09:57] Processing group of 2770 in 3 groups of 1000
[INFO] [2020-06-29 17:09:58] Average Time: 0.357
[INFO] [2020-06-29 17:09:58] Total Time: 2s
[INFO] [2020-06-29 17:09:58] Storing 2770 Nodes
[INFO] [2020-06-29 17:09:58] Processing group of 2770 in 3 groups of 1000
[INFO] [2020-06-29 17:09:59] Average Time: 0.35
[INFO] [2020-06-29 17:09:59] Total Time: 2s
[INFO] [2020-06-29 17:09:59] Storing 2567 Occurrences
[INFO] [2020-06-29 17:09:59] Processing group of 2567 in 3 groups of 1000
[INFO] [2020-06-29 17:10:00] Average Time: 0.183
[INFO] [2020-06-29 17:10:00] Total Time: 1s
[INFO] [2020-06-29 17:10:00] Storing 46351 Traits
[INFO] [2020-06-29 17:10:00] Processing group of 46351 in 47 groups of 1000
[INFO] [2020-06-29 17:10:14] Average Time: 0.295
[INFO] [2020-06-29 17:10:14] Total Time: 15s
[INFO] [2020-06-29 17:10:14] last 3 / first 3: 0.69
[INFO] [2020-06-29 17:10:14] Std.Dev: 0.15165750888103102; Max: 1.17
[INFO] [2020-06-29 17:10:14] Storing 85746 MetaTraits
[INFO] [2020-06-29 17:10:14] Processing group of 85746 in 86 groups of 1000
[INFO] [2020-06-29 17:10:24] Average Time: 0.113
[INFO] [2020-06-29 17:10:24] Total Time: 10s
[INFO] [2020-06-29 17:10:24] last 3 / first 3: 0.81
[INFO] [2020-06-29 17:10:24] Std.Dev: 0.10488088481701516; Max: 0.95
[STOP] [2020-06-29 17:10:24] parse_diff_and_store
[START] [2020-06-29 17:10:24] resolve_keys
[INFO] [2020-06-29 17:10:34] Occurrences to nodes (through scientific_names)...
[INFO] [2020-06-29 17:10:34] traits to occurrences...
[INFO] [2020-06-29 17:10:36] traits to nodes (through occurrences)...
[INFO] [2020-06-29 17:10:37] Traits to sex term...
[INFO] [2020-06-29 17:10:37] Traits to lifestage term...
[INFO] [2020-06-29 17:10:38] MetaTraits to traits...
[INFO] [2020-06-29 17:10:43] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2020-06-29 17:10:44] Assocs to occurrences...
[INFO] [2020-06-29 17:10:44] Assocs to nodes...
[INFO] [2020-06-29 17:10:44] Assoc to sex term...
[INFO] [2020-06-29 17:10:44] Assoc to lifestage term...
[STOP] [2020-06-29 17:10:44] resolve_keys
[START] [2020-06-29 17:10:44] hold_for_later_1
[STOP] [2020-06-29 17:10:44] hold_for_later_1
[START] [2020-06-29 17:10:44] hold_for_later_2
[STOP] [2020-06-29 17:10:44] hold_for_later_2
[START] [2020-06-29 17:10:44] resolve_missing_parents
[STOP] [2020-06-29 17:10:44] resolve_missing_parents
[START] [2020-06-29 17:10:44] rebuild_nodes
[START] [2020-06-29 17:10:44] Flattener#flatten
[START] [2020-06-29 17:10:44] Flattener#study_resource
[START] [2020-06-29 17:10:44] Flattener#build_ancestry
[STOP] [2020-06-29 17:10:44] Flattener#build_ancestry
[INFO] [2020-06-29 17:10:44] 2770 ancestry keys
[START] [2020-06-29 17:10:44] build_node_ancestors
[INFO] [2020-06-29 17:10:44] old ancestors deleted.
[STOP] [2020-06-29 17:10:45] build_node_ancestors
[START] [2020-06-29 17:10:45] Flattener#propagate_ancestor_ids
[STOP] [2020-06-29 17:10:45] Flattener#propagate_ancestor_ids
[STOP] [2020-06-29 17:10:45] Flattener#flatten
[STOP] [2020-06-29 17:10:45] rebuild_nodes
[START] [2020-06-29 17:10:45] resolve_missing_media_owners
[STOP] [2020-06-29 17:10:45] resolve_missing_media_owners
[START] [2020-06-29 17:10:45] sanitize_media_verbatims
[STOP] [2020-06-29 17:10:45] sanitize_media_verbatims
[START] [2020-06-29 17:10:45] queue_downloads
[STOP] [2020-06-29 17:10:45] queue_downloads
[START] [2020-06-29 17:10:45] parse_names
[WARN] [2020-06-29 17:10:45] I see 2770 names which still need to be parsed.
[STOP] [2020-06-29 17:10:48] parse_names
[START] [2020-06-29 17:10:48] denormalize_canonical_names_to_nodes
[STOP] [2020-06-29 17:10:48] denormalize_canonical_names_to_nodes
[START] [2020-06-29 17:10:48] match_nodes
[START] [2020-06-29 17:10:48] map_all_nodes_to_pages
[STOP] [2020-06-29 17:14:34] map_all_nodes_to_pages
[INFO] [2020-06-29 17:14:34] 182 Unmatched nodes (of 2770)! That's too many to output. First 10: Stylosanthes guianensis intermedia (#80136729); Trifolium agrarium (#80136797); Trifolium cherlei (#80136802); Acacia anticeps (#80137569); Acacia macdonnelliensis (#80137608); Acacia millifera (#80137615); Acacia parchcarpa (#80137623); Acacia xiphohylla (#80137649); Albizia lucida (#80137731); Albizia stipulata (#80137879)
[START] [2020-06-29 17:14:34] update_nodes
[STOP] [2020-06-29 17:14:35] update_nodes
[STOP] [2020-06-29 17:14:35] match_nodes
[START] [2020-06-29 17:14:35] reindex_search
[STOP] [2020-06-29 17:14:38] reindex_search
[START] [2020-06-29 17:14:38] normalize_units
[STOP] [2020-06-29 17:15:19] normalize_units
[START] [2020-06-29 17:15:19] calculate_statistics
[STOP] [2020-06-29 17:15:19] calculate_statistics
[START] [2020-06-29 17:15:19] complete_harvest_instance
[START] [2020-06-29 17:15:19] overall_tsv_creation
[INFO] [2020-06-29 17:15:19] Processing group of 2770 in 1 batches of 10000
[INFO] [2020-06-29 17:16:19] 30492 Traits (unfiltered)...
[INFO] [2020-06-29 17:16:32] 30492 Traits (filtered)...
[INFO] [2020-06-29 17:16:32] 0 Associations (filtered)...
[INFO] [2020-06-29 17:17:35] 101453 metadata added.
[INFO] [2020-06-29 17:17:35] 0 metadata added.
[INFO] [2020-06-29 17:17:35] Average Time: 106.05
[INFO] [2020-06-29 17:17:35] Total Time: 2m17s
[STOP] [2020-06-29 17:17:35] overall_tsv_creation
[INFO] [2020-06-29 17:17:35] Done. Check your files:
[INFO] [2020-06-29 17:17:35] (2770 lines) /app/public/data/ECOCROP/publish_nodes.tsv
[INFO] [2020-06-29 17:17:35] (5155 lines) /app/public/data/ECOCROP/publish_node_ancestors.tsv
[INFO] [2020-06-29 17:17:35] (2770 lines) /app/public/data/ECOCROP/publish_scientific_names.tsv
[INFO] [2020-06-29 17:17:35] (30493 lines) /app/public/data/ECOCROP/publish_traits.tsv
[INFO] [2020-06-29 17:17:35] (101454 lines) /app/public/data/ECOCROP/publish_metadata.tsv
[STOP] [2020-06-29 17:17:35] complete_harvest_instance
[START] [2020-06-29 17:17:35] completed
[STOP] [2020-06-29 17:17:35] completed
[STOP] [2020-06-29 17:17:35] logged process, took 676.51
[INFO] [2020-12-28 11:35:32] ## HARVEST: type = re_-harvest
[START] [2020-12-28 11:35:58] logged process: ca5be136aef877c71c74100a42de34a9e7a07645

[START] [2020-12-28 11:35:58] create_harvest_instance
[STOP] [2020-12-28 11:36:00] create_harvest_instance
[START] [2020-12-28 11:36:00] fetch_files
[STOP] [2020-12-28 11:36:00] fetch_files
[START] [2020-12-28 11:36:00] validate_each_file
[STOP] [2020-12-28 11:36:02] validate_each_file
[START] [2020-12-28 11:36:02] convert_to_csv
[CMD] [2020-12-28 11:36:02] /usr/bin/sort /app/public/converted_csv/ECOCROP_nodes_25854.csv > /app/public/converted_csv/ECOCROP_nodes_25854.csv_sorted
[CMD] [2020-12-28 11:36:02] /usr/bin/sort /app/public/converted_csv/ECOCROP_occurrences_25855.csv > /app/public/converted_csv/ECOCROP_occurrences_25855.csv_sorted
[CMD] [2020-12-28 11:36:02] /usr/bin/sort /app/public/converted_csv/ECOCROP_measurements_25856.csv > /app/public/converted_csv/ECOCROP_measurements_25856.csv_sorted
[STOP] [2020-12-28 11:36:02] convert_to_csv
[START] [2020-12-28 11:36:02] calculate_delta
[CMD] [2020-12-28 11:36:02] echo "0a" > /app/public/diff/ECOCROP_nodes_25854.diff
[CMD] [2020-12-28 11:36:02] tail -n +1 /app/public/converted_csv/ECOCROP_nodes_25854.csv >> /app/public/diff/ECOCROP_nodes_25854.diff
[CMD] [2020-12-28 11:36:02] echo "." >> /app/public/diff/ECOCROP_nodes_25854.diff
[CMD] [2020-12-28 11:36:02] echo "0a" > /app/public/diff/ECOCROP_occurrences_25855.diff
[CMD] [2020-12-28 11:36:02] tail -n +1 /app/public/converted_csv/ECOCROP_occurrences_25855.csv >> /app/public/diff/ECOCROP_occurrences_25855.diff
[CMD] [2020-12-28 11:36:02] echo "." >> /app/public/diff/ECOCROP_occurrences_25855.diff
[CMD] [2020-12-28 11:36:02] echo "0a" > /app/public/diff/ECOCROP_measurements_25856.diff
[CMD] [2020-12-28 11:36:02] tail -n +1 /app/public/converted_csv/ECOCROP_measurements_25856.csv >> /app/public/diff/ECOCROP_measurements_25856.diff
[CMD] [2020-12-28 11:36:02] echo "." >> /app/public/diff/ECOCROP_measurements_25856.diff
[STOP] [2020-12-28 11:36:02] calculate_delta
[START] [2020-12-28 11:36:02] parse_diff_and_store
[INFO] [2020-12-28 11:36:02] Loading nodes diff file into memory (true lines)...
[WARN] [2020-12-28 11:36:02] Filtered Scientific Name `Setaria sphacelata  var. sericea` to `Setaria sphacelata var. sericea`
[INFO] [2020-12-28 11:36:03] Loading occurrences diff file into memory (true lines)...
[INFO] [2020-12-28 11:36:03] Loading measurements diff file into memory (true lines)...
[STOP] [2020-12-28 11:36:04] parse_diff_and_store
[ERR] [2020-12-28 11:36:04] RuntimeError
[ERR] [2020-12-28 11:36:04] Missing Term for URI `http://purl.bioontology.org/ontology/STY/T131`, must be added!
[ERR] [2020-12-28 11:36:04] ../models/store/model_builder.rb:635:in `fail_on_bad_uri'
[ERR] [2020-12-28 11:36:04] ../models/store/model_builder.rb:589:in `convert_trait_value'
[ERR] [2020-12-28 11:36:04] ../models/store/model_builder.rb:415:in `build_trait'
[ERR] [2020-12-28 11:36:04] ../models/store/model_builder.rb:28:in `build_models'
[ERR] [2020-12-28 11:36:04] ../models/resource_harvester.rb:359:in `block (3 levels) in parse_diff_and_store'
[ERR] [2020-12-28 11:36:04] ../models/csv_parser.rb:111:in `block in diff_as_hashes'
[ERR] [2020-12-28 11:36:04] ../models/csv_parser.rb:28:in `block in line_at_a_time'
[ERR] [2020-12-28 11:36:04] ../models/csv_parser.rb:25:in `line_at_a_time'
[ERR] [2020-12-28 11:36:04] ../models/csv_parser.rb:96:in `diff_as_hashes'
[ERR] [2020-12-28 11:36:04] ../models/resource_harvester.rb:313:in `block (2 levels) in parse_diff_and_store'
[ERR] [2020-12-28 11:36:04] ../models/logged_process.rb:70:in `enter_group'
[ERR] [2020-12-28 11:36:04] ../models/resource_harvester.rb:312:in `block in parse_diff_and_store'
[ERR] [2020-12-28 11:36:04] ../models/resource_harvester.rb:702:in `block in each_diff'
[ERR] [2020-12-28 11:36:04] ../models/resource_harvester.rb:691:in `each_diff'
[ERR] [2020-12-28 11:36:04] ../models/resource_harvester.rb:307:in `parse_diff_and_store'
[ERR] [2020-12-28 11:36:04] ../models/resource_harvester.rb:85:in `block (3 levels) in start'
[ERR] [2020-12-28 11:36:04] ../models/logged_process.rb:27:in `run_step'
[ERR] [2020-12-28 11:36:04] ../models/resource_harvester.rb:85:in `block (2 levels) in start'
[ERR] [2020-12-28 11:36:04] ../models/resource_harvester.rb:74:in `each_key'
[ERR] [2020-12-28 11:36:04] ../models/resource_harvester.rb:74:in `block in start'
[ERR] [2020-12-28 11:36:04] ../models/resource.rb:152:in `lock'
[ERR] [2020-12-28 11:36:04] ../models/resource_harvester.rb:72:in `start'
[ERR] [2020-12-28 11:36:04] ../models/resource.rb:238:in `harvest'
[ERR] [2020-12-28 11:36:04] ../models/resource.rb:243:in `re_harvest'
[ERR] [2020-12-28 11:36:04] bin/rails:4:in `require'
[ERR] [2020-12-28 11:36:04] bin/rails:4:in `<main>'
[STOP] [2020-12-28 11:36:04] logged process, took 6.24
[INFO] [2020-12-28 13:20:41] ## HARVEST: type = re_download_opendata_-harvest
[INFO] [2020-12-28 13:20:44] ## remove_type: ScientificName
[INFO] [2020-12-28 13:20:44] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-28 13:20:44] [13:20:44.598] Removed 0 Scientificnames
[INFO] [2020-12-28 13:20:44] ## remove_type: Vernacular
[INFO] [2020-12-28 13:20:44] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-28 13:20:44] [13:20:44.601] Removed 0 Vernaculars
[INFO] [2020-12-28 13:20:44] ## remove_type: Article
[INFO] [2020-12-28 13:20:44] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-28 13:20:44] [13:20:44.604] Removed 0 Articles
[INFO] [2020-12-28 13:20:44] ## remove_type: Medium
[INFO] [2020-12-28 13:20:44] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-28 13:20:44] [13:20:44.607] Removed 0 Media
[INFO] [2020-12-28 13:20:44] ## remove_type: Trait
[INFO] [2020-12-28 13:20:44] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-28 13:20:44] [13:20:44.610] Removed 0 Traits
[INFO] [2020-12-28 13:20:44] ## remove_type: MetaTrait
[INFO] [2020-12-28 13:20:44] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-28 13:20:44] [13:20:44.613] Removed 0 Metatraits
[INFO] [2020-12-28 13:20:44] ## remove_type: OccurrenceMetadatum
[INFO] [2020-12-28 13:20:44] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-28 13:20:44] [13:20:44.616] Removed 0 Occurrencemetadata
[INFO] [2020-12-28 13:20:44] ## remove_type: Assoc
[INFO] [2020-12-28 13:20:44] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-28 13:20:44] [13:20:44.619] Removed 0 Assocs
[INFO] [2020-12-28 13:20:44] ## remove_type: MetaAssoc
[INFO] [2020-12-28 13:20:44] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-28 13:20:44] [13:20:44.622] Removed 0 Metaassocs
[INFO] [2020-12-28 13:20:44] ## remove_type: Identifier
[INFO] [2020-12-28 13:20:44] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-28 13:20:44] [13:20:44.625] Removed 0 Identifiers
[INFO] [2020-12-28 13:20:44] ## remove_type: Reference
[INFO] [2020-12-28 13:20:44] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-28 13:20:44] [13:20:44.628] Removed 0 References
[INFO] [2020-12-28 13:20:44] ## remove_type: Node
[INFO] [2020-12-28 13:20:44] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-28 13:20:44] [13:20:44.986] Removed 0 Nodes
[START] [2020-12-28 13:20:45] logged process: ca5be136aef877c71c74100a42de34a9e7a07645

[START] [2020-12-28 13:20:45] Creating resource from OpenData
[START] [2020-12-28 13:20:45] logged process: ca5be136aef877c71c74100a42de34a9e7a07645

[START] [2020-12-28 13:20:45] Parse meta.xml file and create formats with fields
[STOP] [2020-12-28 13:20:45] Parse meta.xml file and create formats with fields
[STOP] [2020-12-28 13:20:45] Creating resource from OpenData
[START] [2020-12-28 13:20:45] logged process: ca5be136aef877c71c74100a42de34a9e7a07645

[START] [2020-12-28 13:20:45] create_harvest_instance
[STOP] [2020-12-28 13:20:47] create_harvest_instance
[START] [2020-12-28 13:20:47] fetch_files
[STOP] [2020-12-28 13:20:47] fetch_files
[START] [2020-12-28 13:20:47] validate_each_file
[STOP] [2020-12-28 13:20:49] validate_each_file
[START] [2020-12-28 13:20:49] convert_to_csv
[CMD] [2020-12-28 13:20:49] /usr/bin/sort /app/public/converted_csv/ECOCROP_nodes_25860.csv > /app/public/converted_csv/ECOCROP_nodes_25860.csv_sorted
[CMD] [2020-12-28 13:20:49] /usr/bin/sort /app/public/converted_csv/ECOCROP_occurrences_25861.csv > /app/public/converted_csv/ECOCROP_occurrences_25861.csv_sorted
[CMD] [2020-12-28 13:20:49] /usr/bin/sort /app/public/converted_csv/ECOCROP_measurements_25862.csv > /app/public/converted_csv/ECOCROP_measurements_25862.csv_sorted
[STOP] [2020-12-28 13:20:49] convert_to_csv
[START] [2020-12-28 13:20:49] calculate_delta
[CMD] [2020-12-28 13:20:49] echo "0a" > /app/public/diff/ECOCROP_nodes_25860.diff
[CMD] [2020-12-28 13:20:49] tail -n +1 /app/public/converted_csv/ECOCROP_nodes_25860.csv >> /app/public/diff/ECOCROP_nodes_25860.diff
[CMD] [2020-12-28 13:20:49] echo "." >> /app/public/diff/ECOCROP_nodes_25860.diff
[CMD] [2020-12-28 13:20:49] echo "0a" > /app/public/diff/ECOCROP_occurrences_25861.diff
[CMD] [2020-12-28 13:20:49] tail -n +1 /app/public/converted_csv/ECOCROP_occurrences_25861.csv >> /app/public/diff/ECOCROP_occurrences_25861.diff
[CMD] [2020-12-28 13:20:49] echo "." >> /app/public/diff/ECOCROP_occurrences_25861.diff
[CMD] [2020-12-28 13:20:49] echo "0a" > /app/public/diff/ECOCROP_measurements_25862.diff
[CMD] [2020-12-28 13:20:49] tail -n +1 /app/public/converted_csv/ECOCROP_measurements_25862.csv >> /app/public/diff/ECOCROP_measurements_25862.diff
[CMD] [2020-12-28 13:20:49] echo "." >> /app/public/diff/ECOCROP_measurements_25862.diff
[STOP] [2020-12-28 13:20:49] calculate_delta
[START] [2020-12-28 13:20:49] parse_diff_and_store
[INFO] [2020-12-28 13:20:49] Loading nodes diff file into memory (true lines)...
[WARN] [2020-12-28 13:20:50] Filtered Scientific Name `Setaria sphacelata  var. sericea` to `Setaria sphacelata var. sericea`
[INFO] [2020-12-28 13:20:50] Loading occurrences diff file into memory (true lines)...
[INFO] [2020-12-28 13:20:50] Loading measurements diff file into memory (true lines)...
[STOP] [2020-12-28 13:20:51] parse_diff_and_store
[ERR] [2020-12-28 13:20:51] RuntimeError
[ERR] [2020-12-28 13:20:51] Missing Term for URI `http://purl.obolibrary.org/obo/CHEBI_48318`, must be added!
[ERR] [2020-12-28 13:20:51] ../models/store/model_builder.rb:635:in `fail_on_bad_uri'
[ERR] [2020-12-28 13:20:51] ../models/store/model_builder.rb:589:in `convert_trait_value'
[ERR] [2020-12-28 13:20:51] ../models/store/model_builder.rb:415:in `build_trait'
[ERR] [2020-12-28 13:20:51] ../models/store/model_builder.rb:28:in `build_models'
[ERR] [2020-12-28 13:20:51] ../models/resource_harvester.rb:359:in `block (3 levels) in parse_diff_and_store'
[ERR] [2020-12-28 13:20:51] ../models/csv_parser.rb:111:in `block in diff_as_hashes'
[ERR] [2020-12-28 13:20:51] ../models/csv_parser.rb:28:in `block in line_at_a_time'
[ERR] [2020-12-28 13:20:51] ../models/csv_parser.rb:25:in `line_at_a_time'
[ERR] [2020-12-28 13:20:51] ../models/csv_parser.rb:96:in `diff_as_hashes'
[ERR] [2020-12-28 13:20:51] ../models/resource_harvester.rb:313:in `block (2 levels) in parse_diff_and_store'
[ERR] [2020-12-28 13:20:51] ../models/logged_process.rb:70:in `enter_group'
[ERR] [2020-12-28 13:20:51] ../models/resource_harvester.rb:312:in `block in parse_diff_and_store'
[ERR] [2020-12-28 13:20:51] ../models/resource_harvester.rb:702:in `block in each_diff'
[ERR] [2020-12-28 13:20:51] ../models/resource_harvester.rb:691:in `each_diff'
[ERR] [2020-12-28 13:20:51] ../models/resource_harvester.rb:307:in `parse_diff_and_store'
[ERR] [2020-12-28 13:20:51] ../models/resource_harvester.rb:85:in `block (3 levels) in start'
[ERR] [2020-12-28 13:20:51] ../models/logged_process.rb:27:in `run_step'
[ERR] [2020-12-28 13:20:51] ../models/resource_harvester.rb:85:in `block (2 levels) in start'
[ERR] [2020-12-28 13:20:51] ../models/resource_harvester.rb:74:in `each_key'
[ERR] [2020-12-28 13:20:51] ../models/resource_harvester.rb:74:in `block in start'
[ERR] [2020-12-28 13:20:51] ../models/resource.rb:152:in `lock'
[ERR] [2020-12-28 13:20:51] ../models/resource_harvester.rb:72:in `start'
[ERR] [2020-12-28 13:20:51] ../models/resource.rb:238:in `harvest'
[ERR] [2020-12-28 13:20:51] ../models/resource.rb:214:in `re_download_opendata_and_harvest'
[ERR] [2020-12-28 13:20:51] bin/rails:4:in `require'
[ERR] [2020-12-28 13:20:51] bin/rails:4:in `<main>'
[STOP] [2020-12-28 13:20:51] logged process, took 5.93
[INFO] [2020-12-29 10:17:02] ## HARVEST: type = re_download_opendata_-harvest
[INFO] [2020-12-29 10:17:05] ## remove_type: ScientificName
[INFO] [2020-12-29 10:17:05] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-29 10:17:05] [10:17:05.584] Removed 0 Scientificnames
[INFO] [2020-12-29 10:17:05] ## remove_type: Vernacular
[INFO] [2020-12-29 10:17:05] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-29 10:17:05] [10:17:05.587] Removed 0 Vernaculars
[INFO] [2020-12-29 10:17:05] ## remove_type: Article
[INFO] [2020-12-29 10:17:05] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-29 10:17:05] [10:17:05.590] Removed 0 Articles
[INFO] [2020-12-29 10:17:05] ## remove_type: Medium
[INFO] [2020-12-29 10:17:05] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-29 10:17:05] [10:17:05.594] Removed 0 Media
[INFO] [2020-12-29 10:17:05] ## remove_type: Trait
[INFO] [2020-12-29 10:17:05] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-29 10:17:05] [10:17:05.597] Removed 0 Traits
[INFO] [2020-12-29 10:17:05] ## remove_type: MetaTrait
[INFO] [2020-12-29 10:17:05] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-29 10:17:05] [10:17:05.600] Removed 0 Metatraits
[INFO] [2020-12-29 10:17:05] ## remove_type: OccurrenceMetadatum
[INFO] [2020-12-29 10:17:05] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-29 10:17:05] [10:17:05.603] Removed 0 Occurrencemetadata
[INFO] [2020-12-29 10:17:05] ## remove_type: Assoc
[INFO] [2020-12-29 10:17:05] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-29 10:17:05] [10:17:05.608] Removed 0 Assocs
[INFO] [2020-12-29 10:17:05] ## remove_type: MetaAssoc
[INFO] [2020-12-29 10:17:05] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-29 10:17:05] [10:17:05.611] Removed 0 Metaassocs
[INFO] [2020-12-29 10:17:05] ## remove_type: Identifier
[INFO] [2020-12-29 10:17:05] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-29 10:17:05] [10:17:05.614] Removed 0 Identifiers
[INFO] [2020-12-29 10:17:05] ## remove_type: Reference
[INFO] [2020-12-29 10:17:05] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-29 10:17:05] [10:17:05.617] Removed 0 References
[INFO] [2020-12-29 10:17:05] ## remove_type: Node
[INFO] [2020-12-29 10:17:05] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-29 10:17:05] [10:17:05.637] Removed 0 Nodes
[START] [2020-12-29 10:17:06] logged process: ca5be136aef877c71c74100a42de34a9e7a07645

[START] [2020-12-29 10:17:06] Creating resource from OpenData
[START] [2020-12-29 10:17:10] logged process: ca5be136aef877c71c74100a42de34a9e7a07645

[START] [2020-12-29 10:17:10] Parse meta.xml file and create formats with fields
[STOP] [2020-12-29 10:17:11] Parse meta.xml file and create formats with fields
[STOP] [2020-12-29 10:17:11] Creating resource from OpenData
[START] [2020-12-29 10:17:11] logged process: ca5be136aef877c71c74100a42de34a9e7a07645

[START] [2020-12-29 10:17:11] create_harvest_instance
[STOP] [2020-12-29 10:17:12] create_harvest_instance
[START] [2020-12-29 10:17:12] fetch_files
[STOP] [2020-12-29 10:17:12] fetch_files
[START] [2020-12-29 10:17:12] validate_each_file
[STOP] [2020-12-29 10:17:14] validate_each_file
[START] [2020-12-29 10:17:14] convert_to_csv
[CMD] [2020-12-29 10:17:14] /usr/bin/sort /app/public/converted_csv/ECOCROP_nodes_25956.csv > /app/public/converted_csv/ECOCROP_nodes_25956.csv_sorted
[CMD] [2020-12-29 10:17:14] /usr/bin/sort /app/public/converted_csv/ECOCROP_occurrences_25957.csv > /app/public/converted_csv/ECOCROP_occurrences_25957.csv_sorted
[CMD] [2020-12-29 10:17:14] /usr/bin/sort /app/public/converted_csv/ECOCROP_measurements_25958.csv > /app/public/converted_csv/ECOCROP_measurements_25958.csv_sorted
[STOP] [2020-12-29 10:17:14] convert_to_csv
[START] [2020-12-29 10:17:14] calculate_delta
[CMD] [2020-12-29 10:17:14] echo "0a" > /app/public/diff/ECOCROP_nodes_25956.diff
[CMD] [2020-12-29 10:17:14] tail -n +1 /app/public/converted_csv/ECOCROP_nodes_25956.csv >> /app/public/diff/ECOCROP_nodes_25956.diff
[CMD] [2020-12-29 10:17:14] echo "." >> /app/public/diff/ECOCROP_nodes_25956.diff
[CMD] [2020-12-29 10:17:14] echo "0a" > /app/public/diff/ECOCROP_occurrences_25957.diff
[CMD] [2020-12-29 10:17:14] tail -n +1 /app/public/converted_csv/ECOCROP_occurrences_25957.csv >> /app/public/diff/ECOCROP_occurrences_25957.diff
[CMD] [2020-12-29 10:17:14] echo "." >> /app/public/diff/ECOCROP_occurrences_25957.diff
[CMD] [2020-12-29 10:17:14] echo "0a" > /app/public/diff/ECOCROP_measurements_25958.diff
[CMD] [2020-12-29 10:17:14] tail -n +1 /app/public/converted_csv/ECOCROP_measurements_25958.csv >> /app/public/diff/ECOCROP_measurements_25958.diff
[CMD] [2020-12-29 10:17:14] echo "." >> /app/public/diff/ECOCROP_measurements_25958.diff
[STOP] [2020-12-29 10:17:14] calculate_delta
[START] [2020-12-29 10:17:14] parse_diff_and_store
[INFO] [2020-12-29 10:17:14] Loading nodes diff file into memory (true lines)...
[WARN] [2020-12-29 10:17:15] Filtered Scientific Name `Setaria sphacelata  var. sericea` to `Setaria sphacelata var. sericea`
[INFO] [2020-12-29 10:17:15] Loading occurrences diff file into memory (true lines)...
[INFO] [2020-12-29 10:17:16] Loading measurements diff file into memory (true lines)...
[STOP] [2020-12-29 10:17:24] parse_diff_and_store
[ERR] [2020-12-29 10:17:24] RuntimeError
[ERR] [2020-12-29 10:17:24] Missing Term for URI `http://eol.org/schema/terms/OptimalGrowthpH`, must be added!
[ERR] [2020-12-29 10:17:24] ../models/store/model_builder.rb:635:in `fail_on_bad_uri'
[ERR] [2020-12-29 10:17:24] ../models/store/model_builder.rb:408:in `build_trait'
[ERR] [2020-12-29 10:17:24] ../models/store/model_builder.rb:28:in `build_models'
[ERR] [2020-12-29 10:17:24] ../models/resource_harvester.rb:359:in `block (3 levels) in parse_diff_and_store'
[ERR] [2020-12-29 10:17:24] ../models/csv_parser.rb:111:in `block in diff_as_hashes'
[ERR] [2020-12-29 10:17:24] ../models/csv_parser.rb:28:in `block in line_at_a_time'
[ERR] [2020-12-29 10:17:24] ../models/csv_parser.rb:25:in `line_at_a_time'
[ERR] [2020-12-29 10:17:24] ../models/csv_parser.rb:96:in `diff_as_hashes'
[ERR] [2020-12-29 10:17:24] ../models/resource_harvester.rb:313:in `block (2 levels) in parse_diff_and_store'
[ERR] [2020-12-29 10:17:24] ../models/logged_process.rb:70:in `enter_group'
[ERR] [2020-12-29 10:17:24] ../models/resource_harvester.rb:312:in `block in parse_diff_and_store'
[ERR] [2020-12-29 10:17:24] ../models/resource_harvester.rb:702:in `block in each_diff'
[ERR] [2020-12-29 10:17:24] ../models/resource_harvester.rb:691:in `each_diff'
[ERR] [2020-12-29 10:17:24] ../models/resource_harvester.rb:307:in `parse_diff_and_store'
[ERR] [2020-12-29 10:17:24] ../models/resource_harvester.rb:85:in `block (3 levels) in start'
[ERR] [2020-12-29 10:17:24] ../models/logged_process.rb:27:in `run_step'
[ERR] [2020-12-29 10:17:24] ../models/resource_harvester.rb:85:in `block (2 levels) in start'
[ERR] [2020-12-29 10:17:24] ../models/resource_harvester.rb:74:in `each_key'
[ERR] [2020-12-29 10:17:24] ../models/resource_harvester.rb:74:in `block in start'
[ERR] [2020-12-29 10:17:24] ../models/resource.rb:152:in `lock'
[ERR] [2020-12-29 10:17:24] ../models/resource_harvester.rb:72:in `start'
[ERR] [2020-12-29 10:17:24] ../models/resource.rb:238:in `harvest'
[ERR] [2020-12-29 10:17:24] ../models/resource.rb:214:in `re_download_opendata_and_harvest'
[ERR] [2020-12-29 10:17:24] bin/rails:4:in `require'
[ERR] [2020-12-29 10:17:24] bin/rails:4:in `<main>'
[STOP] [2020-12-29 10:17:24] logged process, took 12.95
[INFO] [2020-12-29 12:48:10] ## HARVEST: type = re_download_opendata_-harvest
[INFO] [2020-12-29 12:48:14] ## remove_type: ScientificName
[INFO] [2020-12-29 12:48:14] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-29 12:48:14] [12:48:14.640] Removed 0 Scientificnames
[INFO] [2020-12-29 12:48:14] ## remove_type: Vernacular
[INFO] [2020-12-29 12:48:14] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-29 12:48:14] [12:48:14.642] Removed 0 Vernaculars
[INFO] [2020-12-29 12:48:14] ## remove_type: Article
[INFO] [2020-12-29 12:48:14] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-29 12:48:14] [12:48:14.644] Removed 0 Articles
[INFO] [2020-12-29 12:48:14] ## remove_type: Medium
[INFO] [2020-12-29 12:48:14] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-29 12:48:14] [12:48:14.646] Removed 0 Media
[INFO] [2020-12-29 12:48:14] ## remove_type: Trait
[INFO] [2020-12-29 12:48:14] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-29 12:48:14] [12:48:14.649] Removed 0 Traits
[INFO] [2020-12-29 12:48:14] ## remove_type: MetaTrait
[INFO] [2020-12-29 12:48:14] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-29 12:48:14] [12:48:14.655] Removed 0 Metatraits
[INFO] [2020-12-29 12:48:14] ## remove_type: OccurrenceMetadatum
[INFO] [2020-12-29 12:48:14] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-29 12:48:14] [12:48:14.670] Removed 0 Occurrencemetadata
[INFO] [2020-12-29 12:48:14] ## remove_type: Assoc
[INFO] [2020-12-29 12:48:14] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-29 12:48:14] [12:48:14.672] Removed 0 Assocs
[INFO] [2020-12-29 12:48:14] ## remove_type: MetaAssoc
[INFO] [2020-12-29 12:48:14] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-29 12:48:14] [12:48:14.674] Removed 0 Metaassocs
[INFO] [2020-12-29 12:48:14] ## remove_type: Identifier
[INFO] [2020-12-29 12:48:14] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-29 12:48:14] [12:48:14.677] Removed 0 Identifiers
[INFO] [2020-12-29 12:48:14] ## remove_type: Reference
[INFO] [2020-12-29 12:48:14] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-29 12:48:14] [12:48:14.679] Removed 0 References
[INFO] [2020-12-29 12:48:14] ## remove_type: Node
[INFO] [2020-12-29 12:48:14] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-29 12:48:14] [12:48:14.711] Removed 0 Nodes
[START] [2020-12-29 12:48:15] logged process: ca5be136aef877c71c74100a42de34a9e7a07645

[START] [2020-12-29 12:48:15] Creating resource from OpenData
[START] [2020-12-29 12:48:20] logged process: ca5be136aef877c71c74100a42de34a9e7a07645

[START] [2020-12-29 12:48:20] Parse meta.xml file and create formats with fields
[STOP] [2020-12-29 12:48:21] Parse meta.xml file and create formats with fields
[STOP] [2020-12-29 12:48:21] Creating resource from OpenData
[START] [2020-12-29 12:48:21] logged process: ca5be136aef877c71c74100a42de34a9e7a07645

[START] [2020-12-29 12:48:21] create_harvest_instance
[STOP] [2020-12-29 12:48:22] create_harvest_instance
[START] [2020-12-29 12:48:22] fetch_files
[STOP] [2020-12-29 12:48:22] fetch_files
[START] [2020-12-29 12:48:22] validate_each_file
[STOP] [2020-12-29 12:48:25] validate_each_file
[START] [2020-12-29 12:48:25] convert_to_csv
[CMD] [2020-12-29 12:48:25] /usr/bin/sort /app/public/converted_csv/ECOCROP_nodes_25995.csv > /app/public/converted_csv/ECOCROP_nodes_25995.csv_sorted
[CMD] [2020-12-29 12:48:25] /usr/bin/sort /app/public/converted_csv/ECOCROP_occurrences_25996.csv > /app/public/converted_csv/ECOCROP_occurrences_25996.csv_sorted
[CMD] [2020-12-29 12:48:25] /usr/bin/sort /app/public/converted_csv/ECOCROP_measurements_25997.csv > /app/public/converted_csv/ECOCROP_measurements_25997.csv_sorted
[STOP] [2020-12-29 12:48:25] convert_to_csv
[START] [2020-12-29 12:48:25] calculate_delta
[CMD] [2020-12-29 12:48:25] echo "0a" > /app/public/diff/ECOCROP_nodes_25995.diff
[CMD] [2020-12-29 12:48:25] tail -n +1 /app/public/converted_csv/ECOCROP_nodes_25995.csv >> /app/public/diff/ECOCROP_nodes_25995.diff
[CMD] [2020-12-29 12:48:25] echo "." >> /app/public/diff/ECOCROP_nodes_25995.diff
[CMD] [2020-12-29 12:48:25] echo "0a" > /app/public/diff/ECOCROP_occurrences_25996.diff
[CMD] [2020-12-29 12:48:25] tail -n +1 /app/public/converted_csv/ECOCROP_occurrences_25996.csv >> /app/public/diff/ECOCROP_occurrences_25996.diff
[CMD] [2020-12-29 12:48:25] echo "." >> /app/public/diff/ECOCROP_occurrences_25996.diff
[CMD] [2020-12-29 12:48:25] echo "0a" > /app/public/diff/ECOCROP_measurements_25997.diff
[CMD] [2020-12-29 12:48:25] tail -n +1 /app/public/converted_csv/ECOCROP_measurements_25997.csv >> /app/public/diff/ECOCROP_measurements_25997.diff
[CMD] [2020-12-29 12:48:25] echo "." >> /app/public/diff/ECOCROP_measurements_25997.diff
[STOP] [2020-12-29 12:48:25] calculate_delta
[START] [2020-12-29 12:48:25] parse_diff_and_store
[INFO] [2020-12-29 12:48:25] Loading nodes diff file into memory (true lines)...
[WARN] [2020-12-29 12:48:25] Filtered Scientific Name `Setaria sphacelata  var. sericea` to `Setaria sphacelata var. sericea`
[INFO] [2020-12-29 12:48:26] Loading occurrences diff file into memory (true lines)...
[INFO] [2020-12-29 12:48:26] Loading measurements diff file into memory (true lines)...
[STOP] [2020-12-29 12:48:36] parse_diff_and_store
[ERR] [2020-12-29 12:48:36] RuntimeError
[ERR] [2020-12-29 12:48:36] Missing Term for URI `http://eol.org/schema/terms/OptimalGrowthpH`, must be added!
[ERR] [2020-12-29 12:48:36] ../models/store/model_builder.rb:635:in `fail_on_bad_uri'
[ERR] [2020-12-29 12:48:36] ../models/store/model_builder.rb:408:in `build_trait'
[ERR] [2020-12-29 12:48:36] ../models/store/model_builder.rb:28:in `build_models'
[ERR] [2020-12-29 12:48:36] ../models/resource_harvester.rb:359:in `block (3 levels) in parse_diff_and_store'
[ERR] [2020-12-29 12:48:36] ../models/csv_parser.rb:111:in `block in diff_as_hashes'
[ERR] [2020-12-29 12:48:36] ../models/csv_parser.rb:28:in `block in line_at_a_time'
[ERR] [2020-12-29 12:48:36] ../models/csv_parser.rb:25:in `line_at_a_time'
[ERR] [2020-12-29 12:48:36] ../models/csv_parser.rb:96:in `diff_as_hashes'
[ERR] [2020-12-29 12:48:36] ../models/resource_harvester.rb:313:in `block (2 levels) in parse_diff_and_store'
[ERR] [2020-12-29 12:48:36] ../models/logged_process.rb:70:in `enter_group'
[ERR] [2020-12-29 12:48:36] ../models/resource_harvester.rb:312:in `block in parse_diff_and_store'
[ERR] [2020-12-29 12:48:36] ../models/resource_harvester.rb:702:in `block in each_diff'
[ERR] [2020-12-29 12:48:36] ../models/resource_harvester.rb:691:in `each_diff'
[ERR] [2020-12-29 12:48:36] ../models/resource_harvester.rb:307:in `parse_diff_and_store'
[ERR] [2020-12-29 12:48:36] ../models/resource_harvester.rb:85:in `block (3 levels) in start'
[ERR] [2020-12-29 12:48:36] ../models/logged_process.rb:27:in `run_step'
[ERR] [2020-12-29 12:48:36] ../models/resource_harvester.rb:85:in `block (2 levels) in start'
[ERR] [2020-12-29 12:48:36] ../models/resource_harvester.rb:74:in `each_key'
[ERR] [2020-12-29 12:48:36] ../models/resource_harvester.rb:74:in `block in start'
[ERR] [2020-12-29 12:48:36] ../models/resource.rb:152:in `lock'
[ERR] [2020-12-29 12:48:36] ../models/resource_harvester.rb:72:in `start'
[ERR] [2020-12-29 12:48:36] ../models/resource.rb:238:in `harvest'
[ERR] [2020-12-29 12:48:36] ../models/resource.rb:214:in `re_download_opendata_and_harvest'
[ERR] [2020-12-29 12:48:36] bin/rails:4:in `require'
[ERR] [2020-12-29 12:48:36] bin/rails:4:in `<main>'
[STOP] [2020-12-29 12:48:36] logged process, took 15.0
[INFO] [2020-12-30 16:29:44] ## HARVEST: type = re_download_opendata_-harvest
[INFO] [2020-12-30 16:29:48] ## remove_type: ScientificName
[INFO] [2020-12-30 16:29:48] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-30 16:29:48] [16:29:48.121] Removed 0 Scientificnames
[INFO] [2020-12-30 16:29:48] ## remove_type: Vernacular
[INFO] [2020-12-30 16:29:48] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-30 16:29:48] [16:29:48.124] Removed 0 Vernaculars
[INFO] [2020-12-30 16:29:48] ## remove_type: Article
[INFO] [2020-12-30 16:29:48] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-30 16:29:48] [16:29:48.128] Removed 0 Articles
[INFO] [2020-12-30 16:29:48] ## remove_type: Medium
[INFO] [2020-12-30 16:29:48] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-30 16:29:48] [16:29:48.132] Removed 0 Media
[INFO] [2020-12-30 16:29:48] ## remove_type: Trait
[INFO] [2020-12-30 16:29:48] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-30 16:29:48] [16:29:48.135] Removed 0 Traits
[INFO] [2020-12-30 16:29:48] ## remove_type: MetaTrait
[INFO] [2020-12-30 16:29:48] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-30 16:29:48] [16:29:48.143] Removed 0 Metatraits
[INFO] [2020-12-30 16:29:48] ## remove_type: OccurrenceMetadatum
[INFO] [2020-12-30 16:29:48] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-30 16:29:48] [16:29:48.146] Removed 0 Occurrencemetadata
[INFO] [2020-12-30 16:29:48] ## remove_type: Assoc
[INFO] [2020-12-30 16:29:48] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-30 16:29:48] [16:29:48.149] Removed 0 Assocs
[INFO] [2020-12-30 16:29:48] ## remove_type: MetaAssoc
[INFO] [2020-12-30 16:29:48] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-30 16:29:48] [16:29:48.153] Removed 0 Metaassocs
[INFO] [2020-12-30 16:29:48] ## remove_type: Identifier
[INFO] [2020-12-30 16:29:48] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-30 16:29:48] [16:29:48.155] Removed 0 Identifiers
[INFO] [2020-12-30 16:29:48] ## remove_type: Reference
[INFO] [2020-12-30 16:29:48] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-30 16:29:48] [16:29:48.162] Removed 0 References
[INFO] [2020-12-30 16:29:48] ## remove_type: Node
[INFO] [2020-12-30 16:29:48] ++ Calling delete_all on 0 instances...
[INFO] [2020-12-30 16:29:48] [16:29:48.226] Removed 0 Nodes
[START] [2020-12-30 16:29:48] logged process: ca5be136aef877c71c74100a42de34a9e7a07645

[START] [2020-12-30 16:29:48] Creating resource from OpenData
[START] [2020-12-30 16:29:48] logged process: ca5be136aef877c71c74100a42de34a9e7a07645

[START] [2020-12-30 16:29:48] Parse meta.xml file and create formats with fields
[STOP] [2020-12-30 16:29:48] Parse meta.xml file and create formats with fields
[STOP] [2020-12-30 16:29:49] Creating resource from OpenData
[START] [2020-12-30 16:29:49] logged process: ca5be136aef877c71c74100a42de34a9e7a07645

[START] [2020-12-30 16:29:49] create_harvest_instance
[STOP] [2020-12-30 16:29:50] create_harvest_instance
[START] [2020-12-30 16:29:50] fetch_files
[STOP] [2020-12-30 16:29:50] fetch_files
[START] [2020-12-30 16:29:50] validate_each_file
[STOP] [2020-12-30 16:29:52] validate_each_file
[START] [2020-12-30 16:29:52] convert_to_csv
[CMD] [2020-12-30 16:29:52] /usr/bin/sort /app/public/converted_csv/ECOCROP_nodes_26067.csv > /app/public/converted_csv/ECOCROP_nodes_26067.csv_sorted
[CMD] [2020-12-30 16:29:52] /usr/bin/sort /app/public/converted_csv/ECOCROP_occurrences_26068.csv > /app/public/converted_csv/ECOCROP_occurrences_26068.csv_sorted
[CMD] [2020-12-30 16:29:52] /usr/bin/sort /app/public/converted_csv/ECOCROP_measurements_26069.csv > /app/public/converted_csv/ECOCROP_measurements_26069.csv_sorted
[STOP] [2020-12-30 16:29:52] convert_to_csv
[START] [2020-12-30 16:29:52] calculate_delta
[CMD] [2020-12-30 16:29:52] echo "0a" > /app/public/diff/ECOCROP_nodes_26067.diff
[CMD] [2020-12-30 16:29:52] tail -n +1 /app/public/converted_csv/ECOCROP_nodes_26067.csv >> /app/public/diff/ECOCROP_nodes_26067.diff
[CMD] [2020-12-30 16:29:52] echo "." >> /app/public/diff/ECOCROP_nodes_26067.diff
[CMD] [2020-12-30 16:29:52] echo "0a" > /app/public/diff/ECOCROP_occurrences_26068.diff
[CMD] [2020-12-30 16:29:52] tail -n +1 /app/public/converted_csv/ECOCROP_occurrences_26068.csv >> /app/public/diff/ECOCROP_occurrences_26068.diff
[CMD] [2020-12-30 16:29:52] echo "." >> /app/public/diff/ECOCROP_occurrences_26068.diff
[CMD] [2020-12-30 16:29:52] echo "0a" > /app/public/diff/ECOCROP_measurements_26069.diff
[CMD] [2020-12-30 16:29:52] tail -n +1 /app/public/converted_csv/ECOCROP_measurements_26069.csv >> /app/public/diff/ECOCROP_measurements_26069.diff
[CMD] [2020-12-30 16:29:52] echo "." >> /app/public/diff/ECOCROP_measurements_26069.diff
[STOP] [2020-12-30 16:29:52] calculate_delta
[START] [2020-12-30 16:29:52] parse_diff_and_store
[INFO] [2020-12-30 16:29:52] Loading nodes diff file into memory (true lines)...
[WARN] [2020-12-30 16:29:53] Filtered Scientific Name `Setaria sphacelata  var. sericea` to `Setaria sphacelata var. sericea`
[INFO] [2020-12-30 16:29:53] Loading occurrences diff file into memory (true lines)...
[INFO] [2020-12-30 16:29:53] Loading measurements diff file into memory (true lines)...
[INFO] [2020-12-30 16:30:09] Storing 2770 ScientificNames
[INFO] [2020-12-30 16:30:09] Processing group of 2770 in 3 groups of 1000
[INFO] [2020-12-30 16:30:12] Average Time: 0.713
[INFO] [2020-12-30 16:30:12] Total Time: 3s
[INFO] [2020-12-30 16:30:12] Storing 2770 Nodes
[INFO] [2020-12-30 16:30:12] Processing group of 2770 in 3 groups of 1000
[INFO] [2020-12-30 16:30:13] Average Time: 0.49
[INFO] [2020-12-30 16:30:13] Total Time: 2s
[INFO] [2020-12-30 16:30:13] Storing 2567 Occurrences
[INFO] [2020-12-30 16:30:13] Processing group of 2567 in 3 groups of 1000
[INFO] [2020-12-30 16:30:13] Average Time: 0.087
[INFO] [2020-12-30 16:30:13] Total Time: 1s
[INFO] [2020-12-30 16:30:13] Storing 46351 Traits
[INFO] [2020-12-30 16:30:13] Processing group of 46351 in 47 groups of 1000
[INFO] [2020-12-30 16:30:28] Average Time: 0.305
[INFO] [2020-12-30 16:30:28] Total Time: 15s
[INFO] [2020-12-30 16:30:28] last 3 / first 3: 0.85
[INFO] [2020-12-30 16:30:28] Std.Dev: 0.06324555320336758; Max: 0.5
[INFO] [2020-12-30 16:30:28] Storing 33296 MetaTraits
[INFO] [2020-12-30 16:30:28] Processing group of 33296 in 34 groups of 1000
[INFO] [2020-12-30 16:30:32] Average Time: 0.106
[INFO] [2020-12-30 16:30:32] Total Time: 4s
[INFO] [2020-12-30 16:30:32] last 3 / first 3: 0.81
[INFO] [2020-12-30 16:30:32] Std.Dev: 0.0; Max: 0.18
[STOP] [2020-12-30 16:30:32] parse_diff_and_store
[START] [2020-12-30 16:30:32] resolve_keys
[INFO] [2020-12-30 16:31:01] Occurrences to nodes (through scientific_names)...
[INFO] [2020-12-30 16:31:16] traits to occurrences...
[INFO] [2020-12-30 16:31:17] traits to nodes (through occurrences)...
[INFO] [2020-12-30 16:31:18] Traits to sex term...
[INFO] [2020-12-30 16:31:18] Traits to lifestage term...
[INFO] [2020-12-30 16:31:18] MetaTraits to traits...
[INFO] [2020-12-30 16:31:19] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2020-12-30 16:31:20] Assocs to occurrences...
[INFO] [2020-12-30 16:31:20] Assocs to nodes...
[INFO] [2020-12-30 16:31:20] Assoc to sex term...
[INFO] [2020-12-30 16:31:20] Assoc to lifestage term...
[INFO] [2020-12-30 16:31:20] MetaAssoc to assocs...
[STOP] [2020-12-30 16:31:20] resolve_keys
[START] [2020-12-30 16:31:20] hold_for_later_1
[STOP] [2020-12-30 16:31:20] hold_for_later_1
[START] [2020-12-30 16:31:20] hold_for_later_2
[STOP] [2020-12-30 16:31:20] hold_for_later_2
[START] [2020-12-30 16:31:20] resolve_missing_parents
[STOP] [2020-12-30 16:31:20] resolve_missing_parents
[START] [2020-12-30 16:31:20] rebuild_nodes
[START] [2020-12-30 16:31:20] Flattener#flatten
[START] [2020-12-30 16:31:20] Flattener#study_resource
[START] [2020-12-30 16:31:20] Flattener#build_ancestry
[STOP] [2020-12-30 16:31:20] Flattener#build_ancestry
[INFO] [2020-12-30 16:31:20] 2770 ancestry keys
[START] [2020-12-30 16:31:20] build_node_ancestors
[INFO] [2020-12-30 16:31:20] old ancestors deleted.
[STOP] [2020-12-30 16:31:20] build_node_ancestors
[START] [2020-12-30 16:31:21] Flattener#propagate_ancestor_ids
[STOP] [2020-12-30 16:31:21] Flattener#propagate_ancestor_ids
[STOP] [2020-12-30 16:31:21] Flattener#flatten
[STOP] [2020-12-30 16:31:21] rebuild_nodes
[START] [2020-12-30 16:31:21] resolve_missing_media_owners
[STOP] [2020-12-30 16:31:21] resolve_missing_media_owners
[START] [2020-12-30 16:31:21] sanitize_media_verbatims
[STOP] [2020-12-30 16:31:21] sanitize_media_verbatims
[START] [2020-12-30 16:31:21] queue_downloads
[STOP] [2020-12-30 16:31:21] queue_downloads
[START] [2020-12-30 16:31:21] parse_names
[WARN] [2020-12-30 16:31:21] I see 2770 names which still need to be parsed.
[STOP] [2020-12-30 16:31:24] parse_names
[START] [2020-12-30 16:31:24] denormalize_canonical_names_to_nodes
[STOP] [2020-12-30 16:31:24] denormalize_canonical_names_to_nodes
[START] [2020-12-30 16:31:24] match_nodes
[START] [2020-12-30 16:31:24] map_all_nodes_to_pages
[STOP] [2020-12-30 16:33:31] map_all_nodes_to_pages
[INFO] [2020-12-30 16:33:31] 168 Unmatched nodes (of 2770)! That's too many to output. Full list in /app/public/data/ECOCROP/unmatched_nodes.txt ; First 10: Trifolium agrarium (#87459183); Trifolium cherlei (#87459188); Acacia anticeps (#87459955); Acacia macdonnelliensis (#87459994); Acacia millifera (#87460001); Acacia parchcarpa (#87460009); Acacia xiphohylla (#87460035); Albizia lucida (#87460117); Albizia stipulata (#87460265); Rhynchosia diversifolia prostata (#87460463)
[START] [2020-12-30 16:33:31] update_nodes
[STOP] [2020-12-30 16:33:32] update_nodes
[STOP] [2020-12-30 16:33:32] match_nodes
[START] [2020-12-30 16:33:32] reindex_search
[STOP] [2020-12-30 16:33:34] reindex_search
[START] [2020-12-30 16:33:34] normalize_units
[STOP] [2020-12-30 16:34:07] normalize_units
[START] [2020-12-30 16:34:07] calculate_statistics
[STOP] [2020-12-30 16:34:07] calculate_statistics
[START] [2020-12-30 16:34:07] complete_harvest_instance
[START] [2020-12-30 16:34:07] overall_tsv_creation
[INFO] [2020-12-30 16:34:07] Processing group of 2770 in 1 batches of 10000
[INFO] [2020-12-30 16:34:54] 30492 Traits (unfiltered)...
[INFO] [2020-12-30 16:37:20] 30492 Traits (filtered)...
[INFO] [2020-12-30 16:37:20] 0 Associations (filtered)...
[INFO] [2020-12-30 16:37:26] 49037 metadata added.
[INFO] [2020-12-30 16:37:26] 0 metadata added.
[INFO] [2020-12-30 16:38:01] Average Time: 207.23
[INFO] [2020-12-30 16:38:01] Total Time: 3m55s
[STOP] [2020-12-30 16:38:01] overall_tsv_creation
[INFO] [2020-12-30 16:38:01] Done. Check your files:
[INFO] [2020-12-30 16:38:01] (2770 lines) /app/public/data/ECOCROP/publish_nodes.tsv
[INFO] [2020-12-30 16:38:01] (5155 lines) /app/public/data/ECOCROP/publish_node_ancestors.tsv
[INFO] [2020-12-30 16:38:01] (2770 lines) /app/public/data/ECOCROP/publish_scientific_names.tsv
[INFO] [2020-12-30 16:38:01] (30493 lines) /app/public/data/ECOCROP/publish_traits.tsv
[INFO] [2020-12-30 16:38:01] (49038 lines) /app/public/data/ECOCROP/publish_metadata.tsv
[STOP] [2020-12-30 16:38:01] complete_harvest_instance
[START] [2020-12-30 16:38:01] completed
[STOP] [2020-12-30 16:38:01] completed
[STOP] [2020-12-30 16:38:01] logged process, took 492.81

Latest Process