How much data should PIs check?

It sounds like a stupid question until you are balancing 4 different projects with 6 people generating data, but how much responsibility do PIs have for the data produced by their trainees and how deep does one have to go to ensure due diligence? Look at the raw data? The analyzed data? The analysis methods? Is there a difference between new and seasoned trainees?

In more than a few places (an example here), DrugMonkey and others advocate for new faculty to get away from the bench and focus on running the lab rather than generating the data. I don't have a problem with the advice, but inherent in it is the requirement that the trainees generating (and almost certainly analyzing) the data be trusted to do so in the most rigorous way they know how.

Obviously, the PI has the responsibility to train the people in their lab to perform these analyses, but how much double checking should be done? A case in which the result is clearly at odds with expectation is an obvious example for follow-up, but cases of data mis-analysis (or worse, data fraud) that actually reach the PI are rarely cut and dried.

For the most part, it is the inadvertent mistake that can skew results without raising suspicions: removal of "outliers" for "better fit" (which, BTW, can be very sketchy), sample contamination between two closely related species, a slight misalignment in a phylogenetic alignment, etc. The careful and experienced trainee may catch these, but any lab is often made up of trainees with a range of experience and aptitude for recognizing subtle problems. It is not uncommon for even triple-checked data to be later found to contain a significant error, which gets us back to the original question: How deep into the data should a PI go to make sure the end product represents the proper quality of data?

I don't know the answer and I suspect that it becomes a case by case issue, but for the PIs out there, how do you handle this? For the trainees, how much does your PI check through your work?

15 responses so far

  • DrugMonkey says:

    It *IS* one of the scariest parts of the job, trusting other people not to screw up, to notice when they do, to troubleshoot and evaluate. Remember that you actually do have a role as the person less closely tied to the work: you can more dispassionately evaluate, spot weird outcomes and ask questions that penetrate unthinking assumptions. That balances out the fear of trusting others a little bit...

  • A PI should check data - how it was generated and analyzed. However, it need not be painful.

    Lab meetings are where our data is discussed before it is even born. As a group we share approaches, problems, and preliminary data, and it is not just the PI checking our data. It takes a very experienced researcher to fake data AND defend it against a group like ours, where original images and data are discussed way before they are made into a publishable figure.

    The more senior a PI gets, the less likely they will have performed a particular technique themselves. Here a PI relies on his postdocs and grad students to point out inconsistencies between technical challenges and "nice looking data". Encouraging students and postdocs to think like a PI is excellent training as well as directly beneficial to the group as a whole.

    A group brain is better than a single PI sat in his office.

  • Joseph says:

    My experience with this (both in physics and epidemiology) is that the best time to get suspicious is when the results seem too good to be true. Strong and interesting associations that have not previously been observed definitely concern me. I have never seen actual malice in data analysis but I have seen some innocent mistakes that made an effect seem much more important than it is (or even invented the effect).

  • Casey says:

    In the past, I've encountered two serious data integrity issues that could have been dealt with by looking at raw data. One involved a doctored Western blot and became immediately obvious once we had the original film. The other involved quantification of microscopy, and the researcher continually refused to show the actual images. So yes, look at raw data, even when you trust the person ... lab meeting is one very good format.

  • My PI looks at the raw westerns mainly because I don't have time to make a quick figure and he wants to see badass data ASAP. As far as qPCR goes, he is oblivious to the data handling, but I check my numbers against those from our core to make sure they are correct. Usually with assays, he just looks at the graph and not the actual data points, but I tend to bring them anyway in case he wants to see them. If folks are shady they won't offer up the raw data. Honest scientists will show you a film from a western without a second thought.

  • proflikesubstance says:

    I am less concerned about out-right data faking than the small mis-steps that can lead to erroneous results.

  • chall says:

    I would say "in lab meetings or one-on-one meetings when data is discussed" ask questions. Just the ones like "what did you use to analyse this?", "remove any outliers?", "washing steps".... etc...

    Like when troubleshooting, but there is no [obvious] trouble 😉

    I know of some data that got messed up at one of my labs because a tech made a fairly fundamental mistake that no one saw until someone sat next to them and went through the same analysis step by step. So I've always been a bit wary of trusting people I've never worked with who will give me results I depend on. (Yes, a bit paranoid).

    That was not intentional at all though. The person who knew what they were doing was way more sneaky and refused to show photos of the results but always did show "normalised values in a histogram". Yes, I still cringe when I see things like that, since it's very crunched data, imho. Still, it might be 'the only way to visualise some things'...

  • I agree with PLS. I am much more concerned that my students will show me data that has been improperly processed and/or improperly measured than that they will fake outright. It has been very hard for me to trust the data I am fed. I go through how the experiments were done and the data analyzed verbally step by step, but I still worry.

  • Pascale says:

    This. If something looks too good to be true, well, you know the rest. And it's better to risk insulting a tech/student/postdoc than to withdraw papers and grant applications down the road.

  • Yael says:

    My PI always goes through the original data (westerns, including normalization controls, and insists on looking at duplicates/triplicates as well). Almost nothing from my group gets published without going through the lab meeting grinder. We also have joint lab meetings with another group, which means that someone else is looking at the data and criticizing it. In addition, grad students also do this with their committee members. I'd say that we do quite a lot of quality control here...

  • Namnezia says:

    We generate a ton of raw data and it would be impossible for me to look at all of it, but I do ask people to show me the spreadsheets containing the raw numbers once they have been extracted. Occasionally if something looks off, I'll ask the student to go over the analysis of one of his or her experiments to make sure it was done correctly.

  • Tigger says:

    This is an incredibly important topic but one that frequently gets short shrift. I don't have many of the answers but I like some of the suggestions above. On the first day in the lab and just about every day after that I stress to the peeps that getting true data comes first and foremost (after safety...). If you can't trust your data then you might as well not do it.

  • Hope says:

    Removal of “outliers” for “better fit” (which, BTW, can be very sketchy)….

    I remember the hubbub some time ago when Candid Engineer put up a post confessing to sketchy outlier removal in her younger days. It’s extremely important for a PI to be looking over the shoulders of his trainees because even very intelligent and capable students make really STUPID mistakes sometimes. And because they’re still learning, they’re going to step in it more often (than a PI), despite their best efforts.

    That said, my PI is extremely laid-back about this – maybe too much so. We talk about results, look at some of the numbers, and he asks a few questions about what I did, but he’s not exactly looking over my shoulder. I do have more research experience than the average grad student, though, so I suppose that and my sheer brilliance could have lulled him into a false sense of security. 😉

  • Gerty-Z says:

    This is a real concern. I'm worried not only about false positives, but also false negatives. I don't want to miss going down a really productive path because a poor experiment led us to try something different. DM is right, this IS the scariest part of the job.

  • David says:

    A major difference between academics and industry (say, the pharma industry) is that industry gets independent audits of "pivotal" data. In industry, we also know that FDA will scrutinize the work, at a level that academic research is almost never subjected to. After I transitioned from an academic setting to industry, I finally learned how to do it right.
