Other formats:
BibTeX
LaTeX
RIS
@inproceedings{1790654,
  author = {Sotolář, Ondřej and Plhák, Jaromír and Šmahel, David},
  address = {Cham},
  booktitle = {Text, Speech, and Dialogue},
  doi = {10.1007/978-3-030-83527-9_24},
  editor = {Kamil Ekštein and František Pártl and Miloslav Konopík},
  keywords = {Text anonymization; Personal data; Sanitization; De-identification; Privacy protection},
  howpublished = {printed version "print"},
  language = {eng},
  location = {Cham},
  isbn = {978-3-030-83526-2},
  pages = {281--292},
  publisher = {Springer},
  title = {Towards Personal Data Anonymization for Social Messaging},
  url = {https://link.springer.com/chapter/10.1007/978-3-030-83527-9_24},
  year = {2021}
}
TY  - CONF
ID  - 1790654
AU  - Sotolář, Ondřej
AU  - Plhák, Jaromír
AU  - Šmahel, David
PY  - 2021
TI  - Towards Personal Data Anonymization for Social Messaging
PB  - Springer
CY  - Cham
SN  - 9783030835262
KW  - Text anonymization
KW  - Personal data
KW  - Sanitization
KW  - De-identification
KW  - Privacy protection
UR  - https://link.springer.com/chapter/10.1007/978-3-030-83527-9_24
N2  - We present a method for building text corpora for the supervised learning of text-to-text anonymization while maintaining a strict privacy policy. In our solution, personal data entities are detected, classified, and anonymized. We use available machine-learning methods, like named-entity recognition, and improve their performance by grouping multiple entities into larger units based on the theory of tabular data anonymization. Experimental results on annotated Czech Facebook Messenger conversations reveal that our solution has recall comparable to human annotators. On the other hand, precision is much lower because of the low efficiency of the named entity recognition in the domain of social messaging conversations. The resulting anonymized text is of high utility because of the replacement methods that produce natural text.
ER  - 
SOTOLÁŘ, Ondřej, Jaromír PLHÁK and David ŠMAHEL. Towards Personal Data Anonymization for Social Messaging. In Kamil Ekštein, František Pártl, Miloslav Konopík (eds.). \textit{Text, Speech, and Dialogue}. Cham: Springer, 2021, pp.~281--292. ISBN~978-3-030-83526-2. Available from: https://dx.doi.org/10.1007/978-3-030-83527-9\_{}24.