Ilia Kuznetsov

@ilokuznetsov

Postdoc at @UKPLab @TUDarmstadt🌳 // peer review, intertextuality, linguistics, NLP applications // synthesizers 🎹

Darmstadt, Germany

Joined September 2009

137 Following

73 Followers

33 Posts

Ilia Kuznetsov @ilokuznetsov

over 3 years ago

@BertChakovsky marketing ☝️

Ilia Kuznetsov @ilokuznetsov

over 4 years ago

@ani_nenkova Many studies on peer review analysis in NLP and beyond e.g. score calibration, bias, etc. https://t.co/Bhtp95oT5B Also, the problem is charging $60, and not the language quality score per se; nothing bad about language quality scores for peer review texts?

Ilia Kuznetsov @ilokuznetsov

over 4 years ago

@annargrs @ZeerakTalat @ryandcotterell The main issue with consent retraction is dataset stability. For example, if I retract my text, what happens to derivative datasets, models, LMs? Published stats?

Ilia Kuznetsov @ilokuznetsov

over 4 years ago

@ZeerakTalat @annargrs @ryandcotterell Yes, we were thinking about this; it's just a bit trickier to implement. For now we have NO (default), or yes to all reviews for a *given iteration*. So if for example I don't want to contribute one of my reviews, I just say NO to all of my reviews for this month.

Ilia Kuznetsov @ilokuznetsov

over 4 years ago

@annargrs @ryandcotterell @ZeerakTalat Good point; retroactive application is definitely not a good idea. We might ask contributors again or leave the data be (or to be used by ACL). Since the process is continuous, the data will get replenished.

Ilia Kuznetsov @ilokuznetsov

over 4 years ago

@ryandcotterell @annargrs @ZeerakTalat That's what we are discussing now. The way it's often done with protected data is that two institutions (e.g. universities) sign an agreement and then distribute NDA agreements to their members.

Ilia Kuznetsov @ilokuznetsov

over 4 years ago

@annargrs @ryandcotterell @ZeerakTalat Agreed; as soon as we have the workflow for NDA (or whatever mechanism to protect confidentiality/personal data), the license text will need to be adapted.

Ilia Kuznetsov @ilokuznetsov

over 4 years ago

@annargrs @ryandcotterell @ZeerakTalat Yes by NDA we mean "not sharing the data further and only using it for specific research purposes". It's important to keep track of access to the data; it's both personal data + confidentiality.

Ilia Kuznetsov @ilokuznetsov

over 4 years ago

@ZeerakTalat @annargrs @ryandcotterell Based on our understanding, review scores (no titles, no identities) are not personal information. Review texts are, to the extent any other text is. There is a difference between consent and license. Happy to elaborate here: https://t.co/wPSINKdLrb

Ilia Kuznetsov @ilokuznetsov

over 4 years ago

@ZeerakTalat @ryandcotterell @annargrs I think at this point it's really better to move to OpenReview, both for space and to keep track of what is discussed. Twitter is not a good tool for peer review: https://t.co/wPSINKdLrb

Ilia Kuznetsov @ilokuznetsov

over 4 years ago

@ryandcotterell @annargrs @ZeerakTalat I do not understand the problem with PEER. 1) As per the project page, PEER is annotation based. All we need is preprints. ArXiv has plenty. 2) There is textual data on peer review and papers working on that. It's not licensed and has not been collected in a systematic manner.

Ilia Kuznetsov @ilokuznetsov

over 4 years ago

@ZeerakTalat @annargrs @ryandcotterell Do you mean review scores, or do you mean review texts?

Ilia Kuznetsov @ilokuznetsov

over 4 years ago

@annargrs @ryandcotterell Yes, we're actually writing a blog post now; also to have a TL;DR version of the preprint and have more people involved in the discussion. I'll be posting it here once it's online.

Ilia Kuznetsov @ilokuznetsov

over 4 years ago

@annargrs @ZeerakTalat @ryandcotterell No, this is for metadata (scores, etc.). For texts we ask for license transfer, that's what the preprint describes.

Ilia Kuznetsov @ilokuznetsov

over 4 years ago

@ZeerakTalat @ryandcotterell @annargrs Re (1): to clarify, your q is "why would we want a public dataset of peer review texts"; correct? Re (2): that's the metadata consent -- not for texts. Texts are contributed later via license transfer to ACL (see preprint). We're happy to continue here: https://t.co/wPSINKvUFj

Ilia Kuznetsov @ilokuznetsov

over 4 years ago

@annargrs @ZeerakTalat @ryandcotterell For a proper discussion, we have set up public commenting for the pre-print. Anyone with an OR account should be able to post (and outline their concerns/suggestions in more detail than Twitter allows): https://t.co/wPSINKdLrb

Ilia Kuznetsov @ilokuznetsov

over 4 years ago

@annargrs @ryandcotterell Yes, absolutely!! But we really need a solution that most people agree on, very clear terms of data sharing, good mechanisms to guarantee protection, and a clear and open option to opt out. That's in progress :)

Ilia Kuznetsov @ilokuznetsov

over 4 years ago

@ryandcotterell @annargrs Two points need clarification. First, the linked project is not about peer reviewing *automation*, it's about assistance and ML-supported analysis. Whether or not this is useful or the best thing to do is a different point -- guidelines and training are very important.

Ilia Kuznetsov @ilokuznetsov

over 4 years ago

@annargrs Yes, we too! ARR is cool because it has fewer differences between iterations than regular conferences, so one can really measure the effect of interventions, both qualitatively and quantitatively. ETA ~next months. These things go through ARR EiCs, ACL exec, legal team, etc.
