@ani_nenkova There are many studies on peer review analysis in NLP and beyond, e.g. on score calibration, bias, etc. https://t.co/Bhtp95oT5B Also, the problem is charging $60, not the language quality score per se; is there anything bad about language quality scores for peer review texts?
@annargrs @ZeerakTalat @ryandcotterell The main issue with consent retraction is dataset stability. For example, if I retract my text, what happens to derivative datasets, models, LMs? Published stats?
@ZeerakTalat @annargrs @ryandcotterell Yes, we were thinking about this; it's just a bit trickier to implement. For now we have NO (default), or yes to all reviews for a *given iteration*. So if for example I don't want to contribute one of my reviews, I just say NO to all of my reviews for this month.
@annargrs @ryandcotterell @ZeerakTalat Good point; retroactive application is definitely not a good idea. We might ask contributors again or leave the data be (or to be used by ACL). Since the process is continuous, the data will get replenished.
@ryandcotterell @annargrs @ZeerakTalat That's what we are discussing now. The way it's often done with protected data is that two institutions (e.g. universities) sign an agreement and then distribute NDAs to their members.
@annargrs @ryandcotterell @ZeerakTalat Agreed; as soon as we have the workflow for NDA (or whatever mechanism to protect confidentiality/personal data), the license text will need to be adapted.
@annargrs @ryandcotterell @ZeerakTalat Yes, by NDA we mean "not sharing the data further and only using it for specific research purposes". It's important to keep track of access to the data; it's both personal data and confidential.
@ZeerakTalat @annargrs @ryandcotterell Based on our understanding, review scores (no titles, no identities) are not personal information. Review texts are, to the extent any other text is. There is a difference between consent and license. Happy to elaborate here: https://t.co/wPSINKdLrb
@ZeerakTalat @ryandcotterell @annargrs I think at this point it's really better to move to OpenReview, both for space and to keep track of what is discussed. Twitter is not a good tool for peer review: https://t.co/wPSINKdLrb
@ryandcotterell @annargrs @ZeerakTalat I do not understand the problem with PEER.
1) As per project page, PEER is annotation based. All we need is preprints. ArXiv has plenty.
2) There is textual data on peer review, and there are papers working on it. It's not licensed and has not been collected in a systematic manner.
@annargrs @ryandcotterell Yes, we're actually writing a blog post now; also to have a TL;DR version of the preprint and have more people involved in the discussion. I'll be posting it here once it's online.
@annargrs @ZeerakTalat @ryandcotterell No, this is for metadata (scores, etc.). For texts we ask for license transfer, that's what the preprint describes.
@ZeerakTalat @ryandcotterell @annargrs Re (1): to clarify, your q is "why would we want a public dataset of peer review texts"; correct?
Re (2): that's the metadata consent -- not for texts. Texts are contributed later via license transfer to ACL (see preprint). We're happy to continue here: https://t.co/wPSINKvUFj
@annargrs @ZeerakTalat @ryandcotterell For a proper discussion, we have set up public commenting for the pre-print. Anyone with an OR account should be able to post (and outline their concerns/suggestions in more detail than Twitter allows): https://t.co/wPSINKdLrb
@annargrs @ryandcotterell Yes, absolutely!! But we really need a solution that most people agree on, very clear terms of data sharing, good mechanisms to guarantee protection, and a clear and open option to opt out. That's in progress :)
@ryandcotterell @annargrs Two points need clarification. First, the linked project is not about peer reviewing *automation*, it's about assistance and ML-supported analysis. Whether or not this is useful or the best thing to do is a different point -- guidelines and training are very important.
@annargrs Yes, us too! ARR is cool because it has fewer differences between iterations than regular conferences, so one can really measure the effect of interventions, both qualitatively and quantitatively. ETA ~next months. These things go through ARR EiCs, ACL exec, legal team, etc.