SpRadIE targets Named Entity Recognition and Classification in Spanish radiology reports, specifically ultrasound reports.
Seven different classes of concepts in the radiology domain are distinguished. Since these entities refer to very precise, complex concepts, they are realized by correspondingly complex textual forms. Entities may be very long, sometimes even spanning sentence boundaries; they may be embedded within other entities of different types; and they may be discontinuous. Moreover, different text strings, including abbreviations and typos, may be used to refer to the same entity.
The following entities are distinguished:
Anatomical Entity,
Finding,
Location,
Measure,
Type of Measure,
Degree,
Abbreviation.
Hedge cues are also identified, distinguishing:
Negation,
Uncertainty,
Conditional Temporal.
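As a minimal sketch, the label inventory above could be encoded as follows. The names `Label`, `ENTITY_LABELS`, `HEDGE_LABELS` and `is_hedge` are hypothetical and chosen only for illustration; they are not part of any official SpRadIE tooling.

```python
from enum import Enum


class Label(Enum):
    """Label inventory listed above (identifier names are illustrative)."""
    ANATOMICAL_ENTITY = "Anatomical Entity"
    FINDING = "Finding"
    LOCATION = "Location"
    MEASURE = "Measure"
    TYPE_OF_MEASURE = "Type of Measure"
    DEGREE = "Degree"
    ABBREVIATION = "Abbreviation"
    NEGATION = "Negation"
    UNCERTAINTY = "Uncertainty"
    CONDITIONAL_TEMPORAL = "Conditional Temporal"


# The three hedge cue types; every other label is an entity type.
HEDGE_LABELS = frozenset(
    {Label.NEGATION, Label.UNCERTAINTY, Label.CONDITIONAL_TEMPORAL}
)
ENTITY_LABELS = frozenset(set(Label) - HEDGE_LABELS)


def is_hedge(label: Label) -> bool:
    """Return True if the label is a hedge cue rather than an entity type."""
    return label in HEDGE_LABELS
```

Keeping the two groups as separate sets makes it easy to evaluate entities and hedge cues independently, if that is desired.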
The entity type Finding is particularly challenging, as it presents great variability in its textual forms. It ranges from a single word to more than 10 words in some cases, and comprises all kinds of phrases. However, this is also the most informative type of entity for the potential users of these annotations. Other challenging phenomena are the regular polysemy observed between Anatomical Entities and Locations, and the irregular uses of Abbreviations. In the manual annotation process, we have found that human annotators disagree more on those categories than on the others, so we expect that automatic annotators will also have difficulty classifying them consistently.
Entities are formed by a word or a sequence of words, not necessarily continuous, and entities can be embedded within other entities.
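One way to represent entities that may be discontinuous or embedded in one another is to store each entity as a list of character-offset fragments. The sketch below assumes this representation. The class `Entity`, the helper `span` and the two example annotations are hypothetical illustrations and do not reproduce the gold annotations of the corpus.

```python
from dataclasses import dataclass
from typing import Tuple

Fragment = Tuple[int, int]  # (start, end) character offsets, end exclusive


@dataclass(frozen=True)
class Entity:
    label: str
    fragments: Tuple[Fragment, ...]  # sorted, non-overlapping fragments

    def text(self, source: str) -> str:
        """Join the covered fragments, skipping any gaps between them."""
        return " ".join(source[s:e] for s, e in self.fragments)

    @property
    def is_discontinuous(self) -> bool:
        return len(self.fragments) > 1

    def extent(self) -> Fragment:
        """Smallest single span that covers all fragments."""
        return (self.fragments[0][0], self.fragments[-1][1])

    def is_embedded_in(self, other: "Entity") -> bool:
        """True if this entity's extent lies strictly inside the other's."""
        s, e = self.extent()
        os_, oe = other.extent()
        return (s, e) != (os_, oe) and os_ <= s and e <= oe


def span(source: str, piece: str) -> Fragment:
    """Offsets of the first occurrence of `piece` in `source`."""
    start = source.index(piece)
    return (start, start + len(piece))


sentence = "Vía biliar intra y extrahepática: no dilatada."

# Illustrative (assumed) annotations: a discontinuous outer entity and an
# entity of a different type embedded within it.
outer = Entity(
    "Anatomical Entity",
    (span(sentence, "Vía biliar"), span(sentence, "extrahepática")),
)
inner = Entity("Location", (span(sentence, "extrahepática"),))
```

With this representation, `outer.is_discontinuous` and `inner.is_embedded_in(outer)` both hold, and `outer.text(sentence)` recovers the entity text without the intervening words.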
Anatomical Entity: an entity corresponding to an anatomical part, for example breast (pecho), liver (hígado), or right thyroid lobe (lóbulo tiroideo derecho).
vejiga llena
(full bladder)
Finding: a pathological finding or diagnosis, for example cyst or cyanosis.
No se detectaron adenomegalias
(No adenomegalies were detected)
Location: a location in the body. A location may by itself indicate which part of the body is being referred to, or it may relate to an anatomical entity. Examples of locations are: walls, cavity, longitudinal, frontal, occipital, cervicodorsolumbosacra, lumbosacral, intracanalar, subcutanea.
quistes en región biliar
(cysts in biliary region)
Measure and Type of Measure: an expression indicating a measure or a kind of measure.
Diametro longitudinal: 8.1 cm.
(Longitudinal diameter: 8.1 cm.)
Degree: an expression indicating the degree of a finding or of some other property of an entity, for example “leve”, “levemente” (slight, slightly), “mínimo” (minimal).
ligera esplenomegalia
(slight splenomegaly)
leve cambio de la ecogenicidad de la grasa
(slight change in fat echogenicity)
Negation: hedge cues indicating negation.
No se detectaron adenomegalias
(No adenomegalies were detected)
Conditional Temporal: hedge cues indicating that something occurred in the past or may occur in the future, or marking a conditional form.
antecedentes de atresia
(history of atresia)
Si baja de peso
(If she loses weight)
Uncertainty: hedge cues indicating a probability (not a certainty) that some finding may be present in a given patient.
compatible con hipertrofia pilórica
(compatible with pyloric hypertrophy)
Disminución de tamaño a expensas predominantemente del lóbulo derecho.
(Decrease in size predominantly at the expense of the right lobe.)
Si fuera apendicitis sería retrocecal.
(If it were appendicitis, it would be retrocecal.)
Vía biliar intra y extrahepática: no dilatada.
(Intra and extrahepatic bile duct: not dilated.)
Apéndice aumentado de tamaño con apendicolito en su interior.
(Enlarged appendix with appendicolith inside.)
Aumento de la ecogenicidad parenquimatosa y menor diferenciación corticomedular.
(Increased parenchymal echogenicity and decreased corticomedullary differentiation.)
Formación de aspecto sólido con escasa vascularización e imagen quística.
(Solid-looking formation with poor vascularization and cystic image.)
Imágenes anecoicas simples (quísticas) de paredes finas.
(Simple anechoic (cystic) thin-walled images.)
Espesor del músculo 0.5cm.
(Muscle thickness 0.5cm.)
Longitud de canal pilórico 0.9cm.
(Pyloric canal length 0.9cm.)
Páncreas (cabeza).
(Pancreas (head).)