Difference between revisions of "Asandle1 Week 13"

Revision as of 14:21, 16 April 2024

Harbison Paper

The document is called "Pval by Gene", which suggests it holds the p-values for each gene; that fits what the paper is about.
The document has 6,231 rows, which matches the number of genes reported in the paper.
The document has 206 columns, 203 of which hold values. This suggests there is one value column per transcriptional regulator.
Some value cells just say NaN. Not sure how to deal with these yet.
Column 1 holds the gene names.
Column 2 holds the human-readable gene tags.
Column 3 holds plain-English descriptions of what each entry is.
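
A quick way to double-check these counts before importing anything is a short script. This is only a sketch: the sample data, the column names, and the `summarize_layout` helper are made up for illustration, and it assumes the spreadsheet has been exported to CSV with the three identifier columns first.

```python
import csv
import io
import math

# Tiny stand-in for the real export; the header names and values are made up.
SAMPLE = """orf,gene,description,ABF1,ACE2
YAL001C,TFC3,example description,0.5,NaN
YAL002W,VPS8,example description,NaN,0.001
"""

ID_COLUMNS = 3  # assumption: the first three columns are identifiers, not values


def summarize_layout(handle):
    """Count rows, value columns, and NaN cells in a wide p-value table."""
    reader = csv.reader(handle)
    header = next(reader)
    n_rows = 0
    n_nan = 0
    for row in reader:
        n_rows += 1
        for cell in row[ID_COLUMNS:]:
            # Treat both empty cells and the literal text "NaN" as missing.
            if cell.strip() == "" or math.isnan(float(cell)):
                n_nan += 1
    return {
        "rows": n_rows,
        "columns": len(header),
        "value_columns": len(header) - ID_COLUMNS,
        "nan_cells": n_nan,
    }


summary = summarize_layout(io.StringIO(SAMPLE))
print(summary)
```

On the real file, replacing `io.StringIO(SAMPLE)` with an opened CSV file should report 6,231 rows and 203 value columns if the layout assumptions above hold.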

Layout Assumptions: (Maybe incorrect)

Questions:

Data preprocessing?
- Can we deal with the NaN entries after everything has been added to Access, or do we need to figure out how to remove them first without removing the entire entry they appear in?
- What do we want to import by?
We want to be able to view by gene or by environmental difference. Does this mean making an Access entry for the genes and another for the different experimental conditions?
How do we make sure the database for the Harbison paper also works with all the others? This probably comes down to the primary key, which means we can really only organize by Gene ID, because that is what will be common across the other experiments.
Is there anything I am missing?
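
One possible way to handle the NaN question and the "view by gene or by condition" question together is to reshape the wide sheet into one row per (gene, value column) pair and skip only the cells that are NaN. The sketch below is an assumption about how that could look, not a decided plan: the sample data and the `to_long_rows` helper are hypothetical, and the actual import would still happen in Access.

```python
import csv
import io
import math

# Tiny stand-in for the real export; the header names and values are made up.
SAMPLE = """orf,gene,description,ABF1,ACE2
YAL001C,TFC3,example description,0.5,NaN
YAL002W,VPS8,example description,NaN,0.001
"""


def to_long_rows(handle, id_columns=3):
    """Turn a wide table into (gene_id, column_name, p_value) rows.

    Only the individual NaN or empty cells are skipped, so a gene keeps
    all of its other values. Assumes column 1 is the gene ID.
    """
    reader = csv.reader(handle)
    header = next(reader)
    value_names = header[id_columns:]
    long_rows = []
    for row in reader:
        gene_id = row[0]
        for name, cell in zip(value_names, row[id_columns:]):
            if cell.strip() == "":
                continue
            p_value = float(cell)
            if math.isnan(p_value):
                continue
            long_rows.append((gene_id, name, p_value))
    return long_rows


long_rows = to_long_rows(io.StringIO(SAMPLE))

# The same rows can then be filtered either way.
by_gene = [r for r in long_rows if r[0] == "YAL001C"]
by_column = [r for r in long_rows if r[1] == "ACE2"]
print(by_gene)
print(by_column)
```

In this layout the pair (gene ID, column name) would act as the key for the values, while the gene ID alone stays the link to data from other experiments.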

===In Class Tuesday April 16th Notes===