Héctor Martínez Alonso

My research interests are centered on lexical semantics, dependency syntax, semi-supervised learning and linguistic annotation.
You can contact me at hector DOT martinez DOT a AT gmail DOT com or at my LinkedIn profile.

Data and tools

  • Regpol: the regular-polysemy datasets in English, Danish and Spanish from my doctoral dissertation, the Danish supersense-annotated SemDaX corpus, and a supersense tagger trained on it.
  • I have been involved in the conversion of the Catalan, Danish and Spanish treebanks of Universal Dependencies (UD).
  • Here is also the unsupervised dependency parser for UD from EACL’17.
  • Among my fairly messy GitHub contributions you can find smalltools, a collection of little Python scripts, including a bootstrap-sample significance tester that keeps coming in handy.

Publications

  1. Gantar P, Colman L, Parra Escartín C, Martínez Alonso H. Multiword Expressions: Between Lexicography and NLP. International Journal of Lexicography. August 2018.
  2. Seddah D, de La Clergerie E, Sagot B, Martínez Alonso H, Candito M. Cheating a Parser to Death: Data-driven Cross-Treebank Annotation Transfer. LREC 2018.
  3. Schumann A, Martínez Alonso H. Automatic Annotation of Semantic Term Types in the Complete ACL Anthology Reference Corpus. LREC 2018.
  4. Martínez Alonso H, Makki R, Gu J. CL-SciSumm Shared Task-Team Magma. BIRNDL@SIGIR 2018.
  5. Klerke S, Martínez Alonso H, Plank B. Grotoco@SLAM: Second Language Acquisition Modeling with Simple Features, Learners and Task-wise Models. Workshop on Innovative Use of NLP for Building Educational Applications. 2018.
  6. Sagot B and Martínez Alonso H. Improving neural tagging with lexical information. IWPT 2017.
  7. Constant M and Martínez Alonso H. Benchmarking Joint Lexical and Syntactic Analysis on Multiword-Rich Data. Multiword Expression Workshop 2017.
  8. Martínez Alonso H, Delamaire A and Sagot B. Annotating omission in statement pairs. Linguistic Annotation Workshop 2017.
  9. Martínez Alonso H, Plank B. Multitask learning for semantic sequence prediction under varying data conditions. EACL 2017.
  10. Martínez Alonso H, Agić Ž, Plank B and Søgaard A. Parsing Universal Dependencies without training. EACL 2017.
  11. Martínez Alonso H, Sagot B and Seddah D. From Noisy Question to Minecraft Text: Annotation Challenges in Extreme Syntax Scenario. Workshop on Noisy User-generated Text 2016.
  12. Martínez Alonso H, Johannsen A and Plank B. Supersense tagging with inter-annotator disagreement. Linguistic Annotation Workshop 2016.
  13. Yimam S. M., Martínez Alonso H., Riedl M and Biemann C. Learning Paraphrasing for Multiword Expressions. Multiword Expression Workshop 2016.
  14. Schluter N and Martínez Alonso H. Approximate unsupervised summary optimisation for selections of ROUGE. TALN 2016 (vol. 4).
  15. Agić Ž, Plank B, Martínez Alonso H, Johannsen A, Schluter N and Søgaard A. Multilingual Projection for Parsing Truly Low-Resource Languages. TACL 2016 (vol. 4).
  16. Schlichtkrull M S and Martínez Alonso H. MSejrKu at SemEval-2016 Task 14: Taxonomy Enrichment by Evidence Ranking. SemEval 2016.
  17. Bingel J, Schluter N and Martínez Alonso H. CoastalCPH at SemEval-2016 Task 11: The importance of designing your Neural Networks right. SemEval 2016.
  18. Martínez Alonso H and Zeman D. Universal Dependencies for the AnCora treebanks. Procesamiento del Lenguaje Natural Journal (vol. 57) 2016.
  19. Pedersen BS, Braasch A, Johannsen A, Martínez Alonso H, Nimb S, Olsen S, Søgaard A and Sørensen NH. The SemDaX corpus – sense annotations with scalable sense inventories. LREC 2016.
  20. Martínez Alonso H, Johannsen A, Olsen S, Nimb S and Pedersen BS. An empirically grounded expansion of the supersense inventory. GWC 2016.
  21. Lia K, Martínez Alonso H. Cross-lingual part-of-speech tagging for Maltese. LRL 2015.
  22. Johannsen A, Martínez Alonso H and Søgaard A. Any-language frame-semantic parsing. EMNLP 2015.
  23. Martínez Alonso H, Johannsen A, Lopez de Lacalle O and Agirre E. Predicting word sense annotation agreement. LSDSem 2015.
  24. Plank B, Martínez Alonso H, Agić Ž, Merkler D and Søgaard A. Do dependency parsing metrics correlate with human judgments? CoNLL 2015.
  25. Søgaard A, Agić Ž, Martínez Alonso H, Plank B, Bohnet B and Johannsen A. Inverted Indexing for Cross-lingual NLP. ACL 2015.
  26. Martínez Alonso H, Plank B, Skjærholt A and Søgaard A. Learning to parse with IAA-weighted loss. NAACL-HLT 2015.
  27. Plank B, Martínez Alonso H and Søgaard A. Non-canonical language is not harder to annotate than canonical language. The 9th Linguistic Annotation Workshop (NAACL-HLT 2015).
  28. McGillion S, Martínez Alonso H and Plank B. CPH: Sentiment analysis of Figurative Language on Twitter #easypeasy #not. Task 11 on SemEval2015 (NAACL-HLT 2015).
  29. Hovy D, Plank B, Martínez Alonso H, Søgaard A. Mining for unambiguous instances to adapt POS taggers to new domains. NAACL 2015.
  30. Søgaard A, Plank B and Martínez Alonso H. Using Frame Semantics for Knowledge Extraction from Twitter. AAAI 2015.
  31. Parra Escartín C and Martínez Alonso H. Assessing WordNet for bilingual compound dictionary extraction. Workshop on Multi-word Units in Machine Translation Technology (MUMTTT2015).
  32. Parra Escartín C and Martínez Alonso H. Choosing a Spanish Part-of-Speech tagger for a lexically sensitive task. Procesamiento del Lenguaje Natural 54 (2015).
  33. Martínez Alonso H, Johannsen A, Olsen S, Nimb S, Sørensen N H, Pedersen B S. Supersense tagging for Danish. NODALIDA 2015.
  34. Martínez Alonso H, Plank B, Johannsen A and Søgaard A. Active learning for sense annotation. NODALIDA 2015.
  35. Olsen S, Pedersen BS, Martínez Alonso H and Johannsen A. Coarse-Grained Sense Annotation of Danish across Textual Domains. Semantic resources and semantic annotation for Natural Language Processing and the Digital Humanities (NODALIDA 2015).
  36. Klerke S, Martínez Alonso H, and Søgaard A. Looking hard: Eye tracking for detecting grammaticality of automatically compressed sentences. NODALIDA 2015.
  37. Schluter N, Søgaard A, Elming J, Hovy D, Plank B, Martínez Alonso H, Johannsen A and Klerke S. Copenhagen-Malmö: Tree approximations of semantic parsing problems. Task 8 on SemEval-2014.
  38. Søgaard A, Johannsen A, Plank B, Hovy D, Martínez Alonso H. What is in a p-value in NLP? CoNLL 2014.
  39. Johannsen A, Hovy D, Martínez Alonso H, Plank B, Søgaard A. More or less supervised supersense tagging of Twitter. Third Joint Conference on Lexical and Computational Semantics, *SEM 2014 (Best Paper Award).
  40. Martínez Alonso H and Romeo L. Crowdsourcing as a preprocessing for complex semantic annotation tasks. LREC 2014.
  41. Søgaard A, Martínez Alonso H, Elming J and Johannsen A. Using crowdsourcing to get representations based on regular expressions. EMNLP 2013.
  42. Romeo L, Martínez Alonso H, Bel N. Class-based Word Sense Induction for dot-type nominals. The 6th International Conference on Generative Approaches to the Lexicon, 2013.
  43. Martínez Alonso H, Bel N and Pedersen BS. Annotation of regular polysemy and underspecification. ACL 2013.
  44. Padró M, Ballesteros M, Martínez Alonso H and Bohnet B. Finding dependency parsing limits over a large Spanish corpus. IJCNLP 2013.
  45. Elming J, Johannsen A, Klerke S, Lapponi E, Martínez Alonso H and Søgaard A. Down-stream effects of tree-to-dependency conversions. NAACL-HLT 2013.
  46. Martínez Alonso H, Bel N and Pedersen BS. A voting scheme to detect semantic underspecification. LREC 2012.
  47. Johannsen A, Martínez Alonso H, Klerke S and Søgaard A. EMNLP@CPH: Is frequency all there is to simplicity? Task 1 on SemEval2012.
  48. Martínez Alonso H, Bel N and Pedersen BS. Identification of sense selection in regular polysemy using shallow features. NODALIDA 2011.
  49. Johannsen A, Martínez Alonso H, Rishøj C and Søgaard A. Shared task system description: frustratingly hard compositionality prediction. DiSCo 2011, Workshop on Distributional Semantics and Compositionality (Best System).
  50. Martínez Alonso H, Vivaldi J and Villegas M. Text handling as a Web Service for the IULA processing pipeline. Web Services and Processing Pipelines in HLT (LREC 2010).
  51. Martínez Alonso H, Villegas M, Bel N, Bel S and Alemany F. Lexicography in the grid environment. eLEX 2009.