Harvest for Femorale (Created: 28 May 12:42)

Stage: completed
Fetched: 28 May 12:42
Validated: 28 May 12:42
Deltas Created: 28 May 12:42
Units Normalized: 28 May 13:18
Ancestry Built: 28 May 12:45
Nodes Matched: 28 May 13:16
Names Parsed: 28 May 12:46
New Models Stored: 28 May 12:45
Indexed: 28 May 13:16
Completed: 28 May 13:27
Time to Harvest: 1 minute

Harvesting Log

(211 lines)
[INFO] [2021-05-28 12:42:34] Created harvest instance #3903
[STOP] [2021-05-28 12:42:34] create_harvest_instance
[START] [2021-05-28 12:42:34] fetch_files
[STOP] [2021-05-28 12:42:34] fetch_files
[START] [2021-05-28 12:42:34] validate_each_file
[INFO] [2021-05-28 12:42:34] Looping over 4 formats...
[INFO] [2021-05-28 12:42:34] ...nodes (/app/public/data/Femorale/taxon.tab)
[INFO] [2021-05-28 12:42:34] Valid: /app/public/converted_csv/Femorale_nodes_3903.csv (22173 lines)
[INFO] [2021-05-28 12:42:34] ...media (/app/public/data/Femorale/media_resource.tab)
[INFO] [2021-05-28 12:42:40] Valid: /app/public/converted_csv/Femorale_media_3903.csv (122322 lines)
[INFO] [2021-05-28 12:42:40] ...occurrences (/app/public/data/Femorale/occurrence.tab)
[INFO] [2021-05-28 12:42:40] Valid: /app/public/converted_csv/Femorale_occurrences_3903.csv (22171 lines)
[INFO] [2021-05-28 12:42:40] ...measurements (/app/public/data/Femorale/measurement_or_fact.tab)
[INFO] [2021-05-28 12:42:41] Valid: /app/public/converted_csv/Femorale_measurements_3903.csv (22171 lines)
[STOP] [2021-05-28 12:42:41] validate_each_file
[START] [2021-05-28 12:42:42] convert_to_csv
[INFO] [2021-05-28 12:42:42] Looping over 4 formats...
[INFO] [2021-05-28 12:42:42] ...nodes (/app/public/data/Femorale/taxon.tab)
[CMD] [2021-05-28 12:42:42] /usr/bin/sort /app/public/converted_csv/Femorale_nodes_3903.csv > /app/public/converted_csv/Femorale_nodes_3903.csv_sorted
[INFO] [2021-05-28 12:42:42] Converted: /app/public/converted_csv/Femorale_nodes_3903.csv (22173 lines)
[INFO] [2021-05-28 12:42:42] ...media (/app/public/data/Femorale/media_resource.tab)
[CMD] [2021-05-28 12:42:42] /usr/bin/sort /app/public/converted_csv/Femorale_media_3903.csv > /app/public/converted_csv/Femorale_media_3903.csv_sorted
[INFO] [2021-05-28 12:42:43] Converted: /app/public/converted_csv/Femorale_media_3903.csv (122322 lines)
[INFO] [2021-05-28 12:42:43] ...occurrences (/app/public/data/Femorale/occurrence.tab)
[CMD] [2021-05-28 12:42:43] /usr/bin/sort /app/public/converted_csv/Femorale_occurrences_3903.csv > /app/public/converted_csv/Femorale_occurrences_3903.csv_sorted
[INFO] [2021-05-28 12:42:43] Converted: /app/public/converted_csv/Femorale_occurrences_3903.csv (22171 lines)
[INFO] [2021-05-28 12:42:43] ...measurements (/app/public/data/Femorale/measurement_or_fact.tab)
[CMD] [2021-05-28 12:42:43] /usr/bin/sort /app/public/converted_csv/Femorale_measurements_3903.csv > /app/public/converted_csv/Femorale_measurements_3903.csv_sorted
[INFO] [2021-05-28 12:42:43] Converted: /app/public/converted_csv/Femorale_measurements_3903.csv (22171 lines)
[STOP] [2021-05-28 12:42:43] convert_to_csv
[START] [2021-05-28 12:42:44] calculate_delta
[INFO] [2021-05-28 12:42:44] Looping over 4 formats...
[INFO] [2021-05-28 12:42:44] ...nodes (/app/public/data/Femorale/taxon.tab)
[CMD] [2021-05-28 12:42:44] echo "0a" > /app/public/diff/Femorale_nodes_3903.diff
[CMD] [2021-05-28 12:42:44] tail -n +1 /app/public/converted_csv/Femorale_nodes_3903.csv >> /app/public/diff/Femorale_nodes_3903.diff
[CMD] [2021-05-28 12:42:44] echo "." >> /app/public/diff/Femorale_nodes_3903.diff
[INFO] [2021-05-28 12:42:45] Created diff: /app/public/diff/Femorale_nodes_3903.diff (22175 lines)
[INFO] [2021-05-28 12:42:45] ...media (/app/public/data/Femorale/media_resource.tab)
[CMD] [2021-05-28 12:42:45] echo "0a" > /app/public/diff/Femorale_media_3903.diff
[CMD] [2021-05-28 12:42:45] tail -n +1 /app/public/converted_csv/Femorale_media_3903.csv >> /app/public/diff/Femorale_media_3903.diff
[CMD] [2021-05-28 12:42:45] echo "." >> /app/public/diff/Femorale_media_3903.diff
[INFO] [2021-05-28 12:42:46] Created diff: /app/public/diff/Femorale_media_3903.diff (122324 lines)
[INFO] [2021-05-28 12:42:46] ...occurrences (/app/public/data/Femorale/occurrence.tab)
[CMD] [2021-05-28 12:42:46] echo "0a" > /app/public/diff/Femorale_occurrences_3903.diff
[CMD] [2021-05-28 12:42:46] tail -n +1 /app/public/converted_csv/Femorale_occurrences_3903.csv >> /app/public/diff/Femorale_occurrences_3903.diff
[CMD] [2021-05-28 12:42:47] echo "." >> /app/public/diff/Femorale_occurrences_3903.diff
[INFO] [2021-05-28 12:42:47] Created diff: /app/public/diff/Femorale_occurrences_3903.diff (22173 lines)
[INFO] [2021-05-28 12:42:47] ...measurements (/app/public/data/Femorale/measurement_or_fact.tab)
[CMD] [2021-05-28 12:42:47] echo "0a" > /app/public/diff/Femorale_measurements_3903.diff
[CMD] [2021-05-28 12:42:47] tail -n +1 /app/public/converted_csv/Femorale_measurements_3903.csv >> /app/public/diff/Femorale_measurements_3903.diff
[CMD] [2021-05-28 12:42:47] echo "." >> /app/public/diff/Femorale_measurements_3903.diff
[INFO] [2021-05-28 12:42:48] Created diff: /app/public/diff/Femorale_measurements_3903.diff (22173 lines)
[STOP] [2021-05-28 12:42:48] calculate_delta
[START] [2021-05-28 12:42:48] parse_diff_and_store
[INFO] [2021-05-28 12:42:48] Handling diff: /app/public/diff/Femorale_nodes_3903.diff (22175 lines)
[INFO] [2021-05-28 12:42:48] Loading nodes diff file into memory (22175 /app/public/diff/Femorale_nodes_3903.diff lines)...
[WARN] [2021-05-28 12:42:48] Filtered Scientific Name `Vespericola megasoma ("Dall" Pilsbry, 1928)` to `Vespericola megasoma (Dall Pilsbry, 1928)`
[WARN] [2021-05-28 12:42:48] Filtered Scientific Name `Helicostyla gilva (Sowerby "Pfeiffer", 1845)` to `Helicostyla gilva (Sowerby Pfeiffer, 1845)`
[WARN] [2021-05-28 12:42:49] Filtered Scientific Name `Mutela bourguignati "Ancey" Bourguinat, 1885` to `Mutela bourguignati Ancey Bourguinat, 1885`
[WARN] [2021-05-28 12:42:49] Filtered Scientific Name `Hexaplex cf.  kuesterianus (Tapparone-Canepi, 1875)` to `Hexaplex cf. kuesterianus (Tapparone-Canepi, 1875)`
[WARN] [2021-05-28 12:42:50] Filtered Scientific Name `Vermetus alli  Hadfield & Kay in Hadfield, Kay, Gillette & Lloyd, 1972` to `Vermetus alli Hadfield & Kay in Hadfield, Kay, Gillette & Lloyd, 1972`
[WARN] [2021-05-28 12:42:51] Filtered Scientific Name `Capulus cf.  galeus Dall, 1889` to `Capulus cf. galeus Dall, 1889`
[WARN] [2021-05-28 12:42:51] Filtered Scientific Name `Cypraea carneola propinqua,  sulcidentata and hybrid` to `Cypraea carneola propinqua, sulcidentata and hybrid`
[WARN] [2021-05-28 12:42:51] Filtered Scientific Name `Conus amadis aurantia "Lamarck" Dautzenberg, 1937` to `Conus amadis aurantia Lamarck Dautzenberg, 1937`
[WARN] [2021-05-28 12:42:52] Filtered Scientific Name `Cypraea nebrites "galapaginensis"` to `Cypraea nebrites galapaginensis`
[WARN] [2021-05-28 12:42:52] Filtered Scientific Name `Amphidromus sinistralis lutea "Martens" Fulton, 1896` to `Amphidromus sinistralis lutea Martens Fulton, 1896`
[WARN] [2021-05-28 12:42:52] Filtered Scientific Name `Neptunea polycostata aino Fraussen &  Terryn, 2007` to `Neptunea polycostata aino Fraussen & Terryn, 2007`
[WARN] [2021-05-28 12:42:53] Filtered Scientific Name `Scaevatula pellisserpentis Gofas,\n1990` to `Scaevatula pellisserpentis Gofas,n1990`
[WARN] [2021-05-28 12:42:54] Filtered Scientific Name `Cypraea pantherina albonitens \nMelvill, 1888` to `Cypraea pantherina albonitens nMelvill, 1888`
[WARN] [2021-05-28 12:42:54] Filtered Scientific Name `Scaeochlamys squamea  Dijkstra & Maestrati, 2009` to `Scaeochlamys squamea Dijkstra & Maestrati, 2009`
[WARN] [2021-05-28 12:42:57] Filtered Scientific Name `Porphyrobaphe grevillei ("Sowerby" Pfeiffer, 1876)` to `Porphyrobaphe grevillei (Sowerby Pfeiffer, 1876)`
[INFO] [2021-05-28 12:42:58] Handling diff: /app/public/diff/Femorale_media_3903.diff (122324 lines)
[INFO] [2021-05-28 12:42:58] Loading media diff file into memory (122324 /app/public/diff/Femorale_media_3903.diff lines)...
[INFO] [2021-05-28 12:43:32] Handling diff: /app/public/diff/Femorale_occurrences_3903.diff (22173 lines)
[INFO] [2021-05-28 12:43:33] Loading occurrences diff file into memory (22173 /app/public/diff/Femorale_occurrences_3903.diff lines)...
[INFO] [2021-05-28 12:43:38] Handling diff: /app/public/diff/Femorale_measurements_3903.diff (22173 lines)
[INFO] [2021-05-28 12:43:39] Loading measurements diff file into memory (22173 /app/public/diff/Femorale_measurements_3903.diff lines)...
[INFO] [2021-05-28 12:43:49] Storing 22524 ScientificNames
[INFO] [2021-05-28 12:43:49] Processing group of 22524 in 23 groups of 1000
[INFO] [2021-05-28 12:43:58] Average Time: 0.397
[INFO] [2021-05-28 12:43:58] Total Time: 10s
[INFO] [2021-05-28 12:43:58] last 3 / first 3: 0.8
[INFO] [2021-05-28 12:43:58] Std.Dev: 0.22135943621178655; Max: 0.98
[INFO] [2021-05-28 12:43:58] Storing 22524 Nodes
[INFO] [2021-05-28 12:43:58] Processing group of 22524 in 23 groups of 1000
[INFO] [2021-05-28 12:44:04] Average Time: 0.267
[INFO] [2021-05-28 12:44:04] Total Time: 7s
[INFO] [2021-05-28 12:44:04] last 3 / first 3: 0.86
[INFO] [2021-05-28 12:44:04] Std.Dev: 0.03162277660168379; Max: 0.35
[INFO] [2021-05-28 12:44:04] Storing 122322 Media
[INFO] [2021-05-28 12:44:04] Processing group of 122322 in 123 groups of 1000
[INFO] [2021-05-28 12:44:52] Average Time: 0.378
[INFO] [2021-05-28 12:44:52] Total Time: 48s
[INFO] [2021-05-28 12:44:52] last 3 / first 3: 0.7
[INFO] [2021-05-28 12:44:52] Std.Dev: 0.12649110640673517; Max: 1.23
[INFO] [2021-05-28 12:44:52] Storing 22171 Occurrences
[INFO] [2021-05-28 12:44:52] Processing group of 22171 in 23 groups of 1000
[INFO] [2021-05-28 12:44:54] Average Time: 0.115
[INFO] [2021-05-28 12:44:54] Total Time: 3s
[INFO] [2021-05-28 12:44:54] last 3 / first 3: 1.03
[INFO] [2021-05-28 12:44:54] Std.Dev: 0.03162277660168379; Max: 0.21
[INFO] [2021-05-28 12:44:54] Storing 22171 OccurrenceMetadata
[INFO] [2021-05-28 12:44:54] Processing group of 22171 in 23 groups of 1000
[INFO] [2021-05-28 12:44:57] Average Time: 0.113
[INFO] [2021-05-28 12:44:57] Total Time: 3s
[INFO] [2021-05-28 12:44:57] last 3 / first 3: 0.69
[INFO] [2021-05-28 12:44:57] Std.Dev: 0.03162277660168379; Max: 0.21
[INFO] [2021-05-28 12:44:57] Storing 22171 Traits
[INFO] [2021-05-28 12:44:57] Processing group of 22171 in 23 groups of 1000
[INFO] [2021-05-28 12:45:05] Average Time: 0.323
[INFO] [2021-05-28 12:45:05] Total Time: 8s
[INFO] [2021-05-28 12:45:05] last 3 / first 3: 0.66
[INFO] [2021-05-28 12:45:05] Std.Dev: 0.08366600265340755; Max: 0.54
[INFO] [2021-05-28 12:45:05] Storing 22171 MetaTraits
[INFO] [2021-05-28 12:45:05] Processing group of 22171 in 23 groups of 1000
[INFO] [2021-05-28 12:45:07] Average Time: 0.113
[INFO] [2021-05-28 12:45:07] Total Time: 3s
[INFO] [2021-05-28 12:45:07] last 3 / first 3: 0.59
[INFO] [2021-05-28 12:45:07] Std.Dev: 0.03162277660168379; Max: 0.2
[STOP] [2021-05-28 12:45:07] parse_diff_and_store
[START] [2021-05-28 12:45:07] resolve_keys
[INFO] [2021-05-28 12:45:28] Occurrences to nodes (through scientific_names)...
[INFO] [2021-05-28 12:45:29] traits to occurrences...
[INFO] [2021-05-28 12:45:30] traits to nodes (through occurrences)...
[INFO] [2021-05-28 12:45:30] Traits to sex term...
[INFO] [2021-05-28 12:45:31] Traits to lifestage term...
[INFO] [2021-05-28 12:45:31] MetaTraits to traits...
[INFO] [2021-05-28 12:45:32] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2021-05-28 12:45:32] Assocs to occurrences...
[INFO] [2021-05-28 12:45:32] Assocs to nodes...
[INFO] [2021-05-28 12:45:32] Assoc to sex term...
[INFO] [2021-05-28 12:45:32] Assoc to lifestage term...
[INFO] [2021-05-28 12:45:32] MetaAssoc to assocs...
[STOP] [2021-05-28 12:45:32] resolve_keys
[START] [2021-05-28 12:45:32] hold_for_later_1
[STOP] [2021-05-28 12:45:32] hold_for_later_1
[START] [2021-05-28 12:45:32] hold_for_later_2
[STOP] [2021-05-28 12:45:32] hold_for_later_2
[START] [2021-05-28 12:45:32] resolve_missing_parents
[STOP] [2021-05-28 12:45:32] resolve_missing_parents
[START] [2021-05-28 12:45:33] rebuild_nodes
[START] [2021-05-28 12:45:33] Flattener#flatten
[START] [2021-05-28 12:45:33] Flattener#study_resource
[START] [2021-05-28 12:45:33] Flattener#build_ancestry
[STOP] [2021-05-28 12:45:34] Flattener#build_ancestry
[INFO] [2021-05-28 12:45:34] 22524 ancestry keys
[START] [2021-05-28 12:45:34] build_node_ancestors
[INFO] [2021-05-28 12:45:34] old ancestors deleted.
[STOP] [2021-05-28 12:45:37] build_node_ancestors
[START] [2021-05-28 12:45:41] Flattener#propagate_ancestor_ids
[STOP] [2021-05-28 12:45:42] Flattener#propagate_ancestor_ids
[STOP] [2021-05-28 12:45:42] Flattener#flatten
[STOP] [2021-05-28 12:45:42] rebuild_nodes
[START] [2021-05-28 12:45:42] resolve_missing_media_owners
[STOP] [2021-05-28 12:45:42] resolve_missing_media_owners
[START] [2021-05-28 12:45:42] sanitize_media_verbatims
[STOP] [2021-05-28 12:45:42] sanitize_media_verbatims
[START] [2021-05-28 12:45:42] queue_downloads
[STOP] [2021-05-28 12:45:43] queue_downloads
[START] [2021-05-28 12:45:43] parse_names
[WARN] [2021-05-28 12:45:43] I see 22524 names which still need to be parsed.
[WARN] [2021-05-28 12:46:04] I see 2 names which still need to be parsed.
[STOP] [2021-05-28 12:46:05] parse_names
[START] [2021-05-28 12:46:05] denormalize_canonical_names_to_nodes
[STOP] [2021-05-28 12:46:05] denormalize_canonical_names_to_nodes
[START] [2021-05-28 12:46:05] match_nodes
[START] [2021-05-28 12:46:05] map_all_nodes_to_pages
[STOP] [2021-05-28 13:16:17] map_all_nodes_to_pages
[INFO] [2021-05-28 13:16:17] 1914 Unmatched nodes (of 22524)! That's too many to output. Full list in /app/public/data/Femorale/unmatched_nodes.txt ; First 10: Canonical: Aegista awajiensis; Node#95008125; ResourceID: 030ce6a17d6f8f152c1458cc0f82c869; Canonical: Phoenicobius campanulus; Node#95008184; ResourceID: 03aa705f4b6bd0fda9221034c35f8fad; Canonical: Calocochlia; Node#95010508; ResourceID: 1ced2ff86f292269ea31ed3aa5680b21; Canonical: Nesiohelix kanoi; Node#95011223; ResourceID: 257b5dcccd97c7101fbb0e7db836b475; Canonical: Aegista proba goniosoma; Node#95011593; ResourceID: 292fc24f9ef653d1a740a9fdd3022bc6; Canonical: Ainohelix editha; Node#95011821; ResourceID: 2bf7bfba94c6f4dee9910416e96e6f53; Canonical: Euhadra senckenbergiana notoensis; Node#95011924; ResourceID: 2cfb7c6fddd54b81fc92136c15d74495; Canonical: Aegista caerulea; Node#95012245; ResourceID: 30f049fc3b4323c2702e62aa27e1653d; Canonical: Euhadra latispira yagurai; Node#95012350; ResourceID: 32240fc142dd8f3ee588befc1c3cf0b2; Canonical: Euhadra nachicola; Node#95012386; ResourceID: 32811a9e85884be25ca5f3f20a827f7c
[START] [2021-05-28 13:16:17] update_nodes
[STOP] [2021-05-28 13:16:27] update_nodes
[STOP] [2021-05-28 13:16:27] match_nodes
[START] [2021-05-28 13:16:27] reindex_search
[STOP] [2021-05-28 13:16:48] reindex_search
[START] [2021-05-28 13:16:48] normalize_units
[STOP] [2021-05-28 13:18:11] normalize_units
[START] [2021-05-28 13:18:11] calculate_statistics
[STOP] [2021-05-28 13:18:11] calculate_statistics
[START] [2021-05-28 13:18:11] complete_harvest_instance
[START] [2021-05-28 13:18:11] overall_tsv_creation
[INFO] [2021-05-28 13:18:11] Processing group of 22524 in 3 batches of 10000
[INFO] [2021-05-28 13:20:16] 9695 Traits (unfiltered)...
[INFO] [2021-05-28 13:21:08] 9695 Traits (filtered)...
[INFO] [2021-05-28 13:21:11] 0 Associations (filtered)...
[INFO] [2021-05-28 13:21:12] 0 metadata added.
[INFO] [2021-05-28 13:21:12] 0 metadata added.
[INFO] [2021-05-28 13:24:02] 9957 Traits (unfiltered)...
[INFO] [2021-05-28 13:24:55] 9957 Traits (filtered)...
[INFO] [2021-05-28 13:24:58] 0 Associations (filtered)...
[INFO] [2021-05-28 13:24:59] 0 metadata added.
[INFO] [2021-05-28 13:24:59] 0 metadata added.
[INFO] [2021-05-28 13:26:33] 2519 Traits (unfiltered)...
[INFO] [2021-05-28 13:27:09] 2519 Traits (filtered)...
[INFO] [2021-05-28 13:27:09] 0 Associations (filtered)...
[INFO] [2021-05-28 13:27:10] 0 metadata added.
[INFO] [2021-05-28 13:27:10] 0 metadata added.
[INFO] [2021-05-28 13:27:34] Average Time: 146.907
[INFO] [2021-05-28 13:27:34] Total Time: 9m23s
[STOP] [2021-05-28 13:27:34] overall_tsv_creation
[INFO] [2021-05-28 13:27:34] Done. Check your files:
[INFO] [2021-05-28 13:27:34] (22523 lines) /app/public/data/Femorale/publish_nodes.tsv
[INFO] [2021-05-28 13:27:35] (67216 lines) /app/public/data/Femorale/publish_node_ancestors.tsv
[INFO] [2021-05-28 13:27:35] (22524 lines) /app/public/data/Femorale/publish_scientific_names.tsv
[INFO] [2021-05-28 13:27:35] (122322 lines) /app/public/data/Femorale/publish_media.tsv
[INFO] [2021-05-28 13:27:36] (22924 lines) /app/public/data/Femorale/publish_image_info.tsv
[INFO] [2021-05-28 13:27:36] (22172 lines) /app/public/data/Femorale/publish_traits.tsv
[INFO] [2021-05-28 13:27:36] (1 lines) /app/public/data/Femorale/publish_metadata.tsv
[STOP] [2021-05-28 13:27:36] complete_harvest_instance
[START] [2021-05-28 13:27:36] completed
[STOP] [2021-05-28 13:27:36] completed
[STOP] [2021-05-28 13:27:36] logged process, took 2702.91
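
To see where the wall-clock time in a log like the one above goes, the [START]/[STOP] pairs can be turned into per-stage durations. The following is a minimal sketch in Python, assuming every line follows the `[LEVEL] [YYYY-MM-DD HH:MM:SS] message` layout seen in this log. The function name `stage_durations` and the regular expression are illustrative assumptions and are not part of the harvester itself.

```python
import re
from datetime import datetime

# Assumed line layout, taken from the log above:
#   [LEVEL] [YYYY-MM-DD HH:MM:SS] message
LINE_RE = re.compile(
    r"^\[(\w+)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.*)$"
)


def stage_durations(lines):
    """Return {stage_name: seconds} for every START/STOP pair found.

    Hypothetical helper: stages are paired by name, so nested stages
    (such as map_all_nodes_to_pages inside match_nodes) are handled.
    """
    started = {}
    durations = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip headers, blank lines, anything unparseable
        level, stamp, name = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if level == "START":
            started[name] = when
        elif level == "STOP" and name in started:
            durations[name] = (when - started.pop(name)).total_seconds()
    return durations


# Lines copied verbatim from the log above.
sample = [
    "[START] [2021-05-28 12:46:05] match_nodes",
    "[START] [2021-05-28 12:46:05] map_all_nodes_to_pages",
    "[STOP] [2021-05-28 13:16:17] map_all_nodes_to_pages",
    "[STOP] [2021-05-28 13:16:27] match_nodes",
    "[START] [2021-05-28 13:16:48] normalize_units",
    "[STOP] [2021-05-28 13:18:11] normalize_units",
]

durations = stage_durations(sample)
for name, seconds in sorted(durations.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {seconds:.0f} s")
```

On the sample lines this gives 1822 s for match_nodes, of which 1812 s is map_all_nodes_to_pages, and 83 s for normalize_units. Pairing by stage name rather than by position is a deliberate choice here, since the log nests stages inside one another.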

Latest Process