Harvest for Eastfield College SEM Lab (Created 28 Jan 11:04)

Stage: completed
Fetched: 28 Jan 11:04
Validated: 28 Jan 11:04
Deltas Created: 28 Jan 11:04
Units Normalized: 28 Jan 11:04
Ancestry Built: 28 Jan 11:04
Nodes Matched: 28 Jan 11:04
Names Parsed: 28 Jan 11:04
New Models Stored: 28 Jan 11:04
Indexed: 28 Jan 11:04
Completed: 28 Jan 11:05
Time to Harvest: less than a minute

Harvesting Log

(197 lines)
# Logfile created on 2021-01-28 11:04:19 -0500 by logger.rb/v1.4.2
[INFO] [2021-01-28 11:04:19] ## HARVEST: type = re_download_opendata_-harvest
[INFO] [2021-01-28 11:04:23] ## remove_type: ScientificName
[INFO] [2021-01-28 11:04:23] ++ Calling delete_all on 13 instances...
[INFO] [2021-01-28 11:04:23] [11:04:23.628] Removed 13 Scientificnames
[INFO] [2021-01-28 11:04:23] ## remove_type: Vernacular
[INFO] [2021-01-28 11:04:23] ++ Calling delete_all on 2 instances...
[INFO] [2021-01-28 11:04:23] [11:04:23.631] Removed 2 Vernaculars
[INFO] [2021-01-28 11:04:23] ## remove_type: Article
[INFO] [2021-01-28 11:04:23] ++ Calling delete_all on 0 instances...
[INFO] [2021-01-28 11:04:23] [11:04:23.634] Removed 0 Articles
[INFO] [2021-01-28 11:04:23] ## remove_type: Medium
[INFO] [2021-01-28 11:04:23] ++ Calling delete_all on 133 instances...
[INFO] [2021-01-28 11:04:23] [11:04:23.673] Removed 133 Media
[INFO] [2021-01-28 11:04:23] ## remove_type: Trait
[INFO] [2021-01-28 11:04:23] ++ Calling delete_all on 0 instances...
[INFO] [2021-01-28 11:04:23] [11:04:23.677] Removed 0 Traits
[INFO] [2021-01-28 11:04:23] ## remove_type: MetaTrait
[INFO] [2021-01-28 11:04:23] ++ Calling delete_all on 0 instances...
[INFO] [2021-01-28 11:04:23] [11:04:23.683] Removed 0 Metatraits
[INFO] [2021-01-28 11:04:23] ## remove_type: OccurrenceMetadatum
[INFO] [2021-01-28 11:04:23] ++ Calling delete_all on 0 instances...
[INFO] [2021-01-28 11:04:23] [11:04:23.707] Removed 0 Occurrencemetadata
[INFO] [2021-01-28 11:04:23] ## remove_type: Assoc
[INFO] [2021-01-28 11:04:23] ++ Calling delete_all on 0 instances...
[INFO] [2021-01-28 11:04:23] [11:04:23.710] Removed 0 Assocs
[INFO] [2021-01-28 11:04:23] ## remove_type: MetaAssoc
[INFO] [2021-01-28 11:04:23] ++ Calling delete_all on 0 instances...
[INFO] [2021-01-28 11:04:23] [11:04:23.713] Removed 0 Metaassocs
[INFO] [2021-01-28 11:04:23] ## remove_type: Identifier
[INFO] [2021-01-28 11:04:23] ++ Calling delete_all on 0 instances...
[INFO] [2021-01-28 11:04:23] [11:04:23.716] Removed 0 Identifiers
[INFO] [2021-01-28 11:04:23] ## remove_type: Reference
[INFO] [2021-01-28 11:04:23] ++ Calling delete_all on 0 instances...
[INFO] [2021-01-28 11:04:23] [11:04:23.723] Removed 0 References
[INFO] [2021-01-28 11:04:23] Starting batch with ID 41000828...
[INFO] [2021-01-28 11:04:24] Starting batch with ID 41000828...
[INFO] [2021-01-28 11:04:24] ## remove_type: Node
[INFO] [2021-01-28 11:04:24] ++ Calling delete_all on 13 instances...
[INFO] [2021-01-28 11:04:24] [11:04:24.321] Removed 13 Nodes
[START] [2021-01-28 11:04:25] logged process: e38687fde360d9ce2f48dd55c3bf56ebb9d8efa6

[START] [2021-01-28 11:04:25] Creating resource from OpenData
[START] [2021-01-28 11:04:25] logged process: e38687fde360d9ce2f48dd55c3bf56ebb9d8efa6

[START] [2021-01-28 11:04:25] Parse meta.xml file and create formats with fields
[WARN] [2021-01-28 11:04:26] (common) IGNORED  (vernaculars) field header: countryCode term: http://rs.tdwg.org/dwc/terms/countryCode
[WARN] [2021-01-28 11:04:26] (common) IGNORED  (media) field header: derivedFrom term: http://rs.tdwg.org/ac/terms/derivedFrom
[WARN] [2021-01-28 11:04:26] (common) IGNORED  (media) field header: CreateDate term: http://ns.adobe.com/xap/1.0/CreateDate
[WARN] [2021-01-28 11:04:26] (common) IGNORED  (media) field header: modified term: http://purl.org/dc/terms/modified
[WARN] [2021-01-28 11:04:26] (common) IGNORED  (media) field header: Rating term: http://ns.adobe.com/xap/1.0/Rating
[WARN] [2021-01-28 11:04:26] (common) IGNORED  (media) field header: audience term: http://purl.org/dc/terms/audience
[WARN] [2021-01-28 11:04:26] (common) IGNORED  (media) field header: rights term: http://purl.org/dc/terms/rights
[WARN] [2021-01-28 11:04:26] (common) IGNORED  (media) field header: spatial term: http://purl.org/dc/terms/spatial
[WARN] [2021-01-28 11:04:26] (common) IGNORED  (media) field header: lat term: http://www.w3.org/2003/01/geo/wgs84_pos#lat
[WARN] [2021-01-28 11:04:26] (common) IGNORED  (media) field header: long term: http://www.w3.org/2003/01/geo/wgs84_pos#long
[WARN] [2021-01-28 11:04:26] (common) IGNORED  (media) field header: alt term: http://www.w3.org/2003/01/geo/wgs84_pos#alt
[STOP] [2021-01-28 11:04:26] Parse meta.xml file and create formats with fields
[STOP] [2021-01-28 11:04:26] Creating resource from OpenData
[START] [2021-01-28 11:04:26] logged process: e38687fde360d9ce2f48dd55c3bf56ebb9d8efa6

[START] [2021-01-28 11:04:26] create_harvest_instance
[STOP] [2021-01-28 11:04:30] create_harvest_instance
[START] [2021-01-28 11:04:30] fetch_files
[STOP] [2021-01-28 11:04:30] fetch_files
[START] [2021-01-28 11:04:30] validate_each_file
[STOP] [2021-01-28 11:04:30] validate_each_file
[START] [2021-01-28 11:04:30] convert_to_csv
[CMD] [2021-01-28 11:04:30] /usr/bin/sort /app/public/converted_csv/eastfield_sem_agents_26268.csv > /app/public/converted_csv/eastfield_sem_agents_26268.csv_sorted
[CMD] [2021-01-28 11:04:30] /usr/bin/sort /app/public/converted_csv/eastfield_sem_nodes_26269.csv > /app/public/converted_csv/eastfield_sem_nodes_26269.csv_sorted
[CMD] [2021-01-28 11:04:30] /usr/bin/sort /app/public/converted_csv/eastfield_sem_media_26270.csv > /app/public/converted_csv/eastfield_sem_media_26270.csv_sorted
[CMD] [2021-01-28 11:04:30] /usr/bin/sort /app/public/converted_csv/eastfield_sem_vernaculars_26271.csv > /app/public/converted_csv/eastfield_sem_vernaculars_26271.csv_sorted
[STOP] [2021-01-28 11:04:30] convert_to_csv
[START] [2021-01-28 11:04:30] calculate_delta
[CMD] [2021-01-28 11:04:30] echo "0a" > /app/public/diff/eastfield_sem_agents_26268.diff
[CMD] [2021-01-28 11:04:30] tail -n +1 /app/public/converted_csv/eastfield_sem_agents_26268.csv >> /app/public/diff/eastfield_sem_agents_26268.diff
[CMD] [2021-01-28 11:04:30] echo "." >> /app/public/diff/eastfield_sem_agents_26268.diff
[CMD] [2021-01-28 11:04:30] echo "0a" > /app/public/diff/eastfield_sem_nodes_26269.diff
[CMD] [2021-01-28 11:04:30] tail -n +1 /app/public/converted_csv/eastfield_sem_nodes_26269.csv >> /app/public/diff/eastfield_sem_nodes_26269.diff
[CMD] [2021-01-28 11:04:30] echo "." >> /app/public/diff/eastfield_sem_nodes_26269.diff
[CMD] [2021-01-28 11:04:30] echo "0a" > /app/public/diff/eastfield_sem_media_26270.diff
[CMD] [2021-01-28 11:04:30] tail -n +1 /app/public/converted_csv/eastfield_sem_media_26270.csv >> /app/public/diff/eastfield_sem_media_26270.diff
[CMD] [2021-01-28 11:04:30] echo "." >> /app/public/diff/eastfield_sem_media_26270.diff
[CMD] [2021-01-28 11:04:30] echo "0a" > /app/public/diff/eastfield_sem_vernaculars_26271.diff
[CMD] [2021-01-28 11:04:30] tail -n +1 /app/public/converted_csv/eastfield_sem_vernaculars_26271.csv >> /app/public/diff/eastfield_sem_vernaculars_26271.diff
[CMD] [2021-01-28 11:04:30] echo "." >> /app/public/diff/eastfield_sem_vernaculars_26271.diff
[STOP] [2021-01-28 11:04:30] calculate_delta
[START] [2021-01-28 11:04:30] parse_diff_and_store
[INFO] [2021-01-28 11:04:30] Loading agents diff file into memory (true lines)...
[INFO] [2021-01-28 11:04:30] Loading nodes diff file into memory (true lines)...
[INFO] [2021-01-28 11:04:30] Loading media diff file into memory (true lines)...
[INFO] [2021-01-28 11:04:30] Loading vernaculars diff file into memory (true lines)...
[INFO] [2021-01-28 11:04:30] Storing 4 Attributions
[INFO] [2021-01-28 11:04:30] Processing group of 4 in 1 groups of 1000
[INFO] [2021-01-28 11:04:30] Average Time: 0.0
[INFO] [2021-01-28 11:04:30] Total Time: 1s
[INFO] [2021-01-28 11:04:30] Storing 13 ScientificNames
[INFO] [2021-01-28 11:04:30] Processing group of 13 in 1 groups of 1000
[INFO] [2021-01-28 11:04:30] Average Time: 0.01
[INFO] [2021-01-28 11:04:30] Total Time: 1s
[INFO] [2021-01-28 11:04:30] Storing 13 Nodes
[INFO] [2021-01-28 11:04:30] Processing group of 13 in 1 groups of 1000
[INFO] [2021-01-28 11:04:30] Average Time: 0.01
[INFO] [2021-01-28 11:04:30] Total Time: 1s
[INFO] [2021-01-28 11:04:30] Storing 397 ContentAttributions
[INFO] [2021-01-28 11:04:30] Processing group of 397 in 1 groups of 1000
[INFO] [2021-01-28 11:04:30] Average Time: 0.05
[INFO] [2021-01-28 11:04:30] Total Time: 1s
[INFO] [2021-01-28 11:04:30] Storing 133 Media
[INFO] [2021-01-28 11:04:30] Processing group of 133 in 1 groups of 1000
[INFO] [2021-01-28 11:04:30] Average Time: 0.07
[INFO] [2021-01-28 11:04:30] Total Time: 1s
[INFO] [2021-01-28 11:04:30] Storing 18 BibliographicCitations
[INFO] [2021-01-28 11:04:30] Processing group of 18 in 1 groups of 1000
[INFO] [2021-01-28 11:04:30] Average Time: 0.01
[INFO] [2021-01-28 11:04:30] Total Time: 1s
[INFO] [2021-01-28 11:04:30] Storing 2 Vernaculars
[INFO] [2021-01-28 11:04:30] Processing group of 2 in 1 groups of 1000
[INFO] [2021-01-28 11:04:30] Average Time: 0.0
[INFO] [2021-01-28 11:04:30] Total Time: 1s
[STOP] [2021-01-28 11:04:30] parse_diff_and_store
[START] [2021-01-28 11:04:30] resolve_keys
[INFO] [2021-01-28 11:04:37] Occurrences to nodes (through scientific_names)...
[INFO] [2021-01-28 11:04:37] traits to occurrences...
[INFO] [2021-01-28 11:04:37] traits to nodes (through occurrences)...
[INFO] [2021-01-28 11:04:37] Traits to sex term...
[INFO] [2021-01-28 11:04:37] Traits to lifestage term...
[INFO] [2021-01-28 11:04:37] MetaTraits to traits...
[INFO] [2021-01-28 11:04:37] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2021-01-28 11:04:37] Assocs to occurrences...
[INFO] [2021-01-28 11:04:37] Assocs to nodes...
[INFO] [2021-01-28 11:04:37] Assoc to sex term...
[INFO] [2021-01-28 11:04:37] Assoc to lifestage term...
[INFO] [2021-01-28 11:04:37] MetaAssoc to assocs...
[STOP] [2021-01-28 11:04:37] resolve_keys
[START] [2021-01-28 11:04:37] hold_for_later_1
[STOP] [2021-01-28 11:04:37] hold_for_later_1
[START] [2021-01-28 11:04:37] hold_for_later_2
[STOP] [2021-01-28 11:04:37] hold_for_later_2
[START] [2021-01-28 11:04:37] resolve_missing_parents
[STOP] [2021-01-28 11:04:37] resolve_missing_parents
[START] [2021-01-28 11:04:37] rebuild_nodes
[START] [2021-01-28 11:04:37] Flattener#flatten
[START] [2021-01-28 11:04:37] Flattener#study_resource
[START] [2021-01-28 11:04:37] Flattener#build_ancestry
[STOP] [2021-01-28 11:04:37] Flattener#build_ancestry
[INFO] [2021-01-28 11:04:37] 13 ancestry keys
[START] [2021-01-28 11:04:37] build_node_ancestors
[INFO] [2021-01-28 11:04:37] old ancestors deleted.
[STOP] [2021-01-28 11:04:37] build_node_ancestors
[START] [2021-01-28 11:04:37] Flattener#propagate_ancestor_ids
[STOP] [2021-01-28 11:04:37] Flattener#propagate_ancestor_ids
[STOP] [2021-01-28 11:04:37] Flattener#flatten
[STOP] [2021-01-28 11:04:37] rebuild_nodes
[START] [2021-01-28 11:04:37] resolve_missing_media_owners
[STOP] [2021-01-28 11:04:37] resolve_missing_media_owners
[START] [2021-01-28 11:04:37] sanitize_media_verbatims
[STOP] [2021-01-28 11:04:37] sanitize_media_verbatims
[START] [2021-01-28 11:04:37] queue_downloads
[STOP] [2021-01-28 11:04:37] queue_downloads
[START] [2021-01-28 11:04:37] parse_names
[WARN] [2021-01-28 11:04:37] I see 13 names which still need to be parsed.
[STOP] [2021-01-28 11:04:38] parse_names
[START] [2021-01-28 11:04:38] denormalize_canonical_names_to_nodes
[STOP] [2021-01-28 11:04:38] denormalize_canonical_names_to_nodes
[START] [2021-01-28 11:04:38] match_nodes
[START] [2021-01-28 11:04:38] map_all_nodes_to_pages
[STOP] [2021-01-28 11:04:40] map_all_nodes_to_pages
[INFO] [2021-01-28 11:04:40] Unmatched nodes (3 of 13): Clogmia (#87652066); Clogmia albipunctata (#87652067); Therididae (#87652072)
[START] [2021-01-28 11:04:40] update_nodes
[STOP] [2021-01-28 11:04:40] update_nodes
[STOP] [2021-01-28 11:04:40] match_nodes
[START] [2021-01-28 11:04:40] reindex_search
[STOP] [2021-01-28 11:04:41] reindex_search
[START] [2021-01-28 11:04:41] normalize_units
[STOP] [2021-01-28 11:04:41] normalize_units
[START] [2021-01-28 11:04:41] calculate_statistics
[STOP] [2021-01-28 11:04:41] calculate_statistics
[START] [2021-01-28 11:04:41] complete_harvest_instance
[START] [2021-01-28 11:04:41] overall_tsv_creation
[INFO] [2021-01-28 11:04:41] Processing group of 13 in 1 batches of 10000
[INFO] [2021-01-28 11:05:15] Average Time: 6.8
[INFO] [2021-01-28 11:05:15] Total Time: 35s
[STOP] [2021-01-28 11:05:15] overall_tsv_creation
[INFO] [2021-01-28 11:05:15] Done. Check your files:
[INFO] [2021-01-28 11:05:15] (18 lines) /app/public/data/eastfield_sem/publish_bibliographic_citations.tsv
[INFO] [2021-01-28 11:05:15] (13 lines) /app/public/data/eastfield_sem/publish_nodes.tsv
[INFO] [2021-01-28 11:05:15] (25 lines) /app/public/data/eastfield_sem/publish_node_ancestors.tsv
[INFO] [2021-01-28 11:05:15] (13 lines) /app/public/data/eastfield_sem/publish_scientific_names.tsv
[INFO] [2021-01-28 11:05:15] (133 lines) /app/public/data/eastfield_sem/publish_media.tsv
[INFO] [2021-01-28 11:05:15] (20 lines) /app/public/data/eastfield_sem/publish_image_info.tsv
[INFO] [2021-01-28 11:05:15] (2 lines) /app/public/data/eastfield_sem/publish_vernaculars.tsv
[INFO] [2021-01-28 11:05:15] (397 lines) /app/public/data/eastfield_sem/publish_attributions.tsv
[STOP] [2021-01-28 11:05:15] complete_harvest_instance
[START] [2021-01-28 11:05:15] completed
[STOP] [2021-01-28 11:05:15] completed
[STOP] [2021-01-28 11:05:15] logged process, took 49.3

Latest Process