Leveraging IA3 adapters for parameter-efficient logical deduction, Lightning AI
Investigates IA3 adapters for parameter-efficient fine-tuning to enhance Llama-3’s logical deduction capabilities. (Apr. 2024)
Motivates the formulation of the proximal policy optimization algorithm and applies it for reinforcement learning from human feedback (RLHF) to align Google’s Gemma with human conversational preferences. (Mar. 2024)
Explores the mathematics and code of discrete diffusion models, assessing their effectiveness, applicability, and limitations. (Jun 2022)
Demonstrates the effectiveness of active learning in text classification tasks by implementing a ratio-based sampling approach, improving performance and suitability for industrial applications. (Oct 2021)
Examines how complex-valued neural networks enhance representations by experimenting with traditional mappings and more efficient complex vector techniques. (Jun 2021)
Explores circularity and holomorphicity constraints in complex-valued neural networks, showcasing the effectiveness of linear and widely linear networks for image denoising. (Oct 2021)
Showcases a non-autoregressive model for spectral inversion utilizing a feature-matching discriminator, highlighting fast inference on dynamic inputs. (Sep 2021)
Published in Knowledge-Based Systems, 2023
This paper formalizes prior theoretical work on logical fallacies into a comprehensive three-stage evaluation framework of detection, coarse-grained, and fine-grained classification, and employs three families of robust and explainable methods based on prototype reasoning, instance-based reasoning, and knowledge injection.
Recommended citation: Sourati, Z., Venkatesh, V. P. P., Deshpande, D., Rawlani, H., Ilievski, F., Sandlin, H., & Mermoud, A. (2023). Robust and explainable identification of logical fallacies in natural language arguments. Knowledge-Based Systems, 266, 110418. https://doi.org/10.1016/j.knosys.2023.110418 https://www.sciencedirect.com/science/article/pii/S0950705123001685
Published in arXiv, 2023
Designs a modular and comprehensive framework for studying Prototype-Based Networks (PBNs), covering different backbone architectures, backbone sizes, and objective functions, and shows that the robustness of PBNs transfers to NLP classification tasks facing realistic perturbations.
Recommended citation: Sourati, Z., Deshpande, D., Ilievski, F., Gashteovski, K., & Saralajew, S. (2023). Robust Text Classification: Analyzing Prototype-Based Networks. ArXiv, abs/2311.06647. https://arxiv.org/abs/2311.06647
Published in NAACL-2024, 2024
This work proposes SPARK, a novel method for scoring argument quality based on contextualization via relevant knowledge, and devises four augmentations that leverage large language models to provide feedback, infer hidden assumptions, supply a similar-quality argument, or supply a counterargument.
Recommended citation: Darshan Deshpande, Zhivar Sourati, Filip Ilievski, and Fred Morstatter. 2024. Contextualizing Argument Quality Assessment with Relevant Knowledge. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers), pages 316–326, Mexico City, Mexico. Association for Computational Linguistics. https://aclanthology.org/2024.naacl-short.28/
Published in arXiv, 2024
Introduces GNOME, an automated framework that uses Large Language Models to generate synthetic open-domain negotiation dialogues from closed-domain datasets, addressing the limited generalizability of existing negotiation models. Experiments show that models trained on GNOME-generated data outperform state-of-the-art models in both domain-specific strategy prediction and generalization to novel domains, while reducing manual data curation efforts.
Recommended citation: Deshpande, D., Sinha, S., Kumar, A., Pal, D. & May, J. (2024). GNOME: Generating Negotiations through Open-Domain Mapping of Exchanges. ArXiv, abs/2406.10764. https://arxiv.org/abs/2406.10764
Published:
Discussing the hype surrounding diffusion models and exploring their effectiveness, applicability, and drawbacks. [slides]