edited on 2016.01.22
This document explains how raw fluorescence excitation-emission
matrix (EEM) data files can be imported into R as an EEM class
object. The raw data files here are text-based files (*.csv, *.txt)
that can be exported from the fluorescence spectrometer.
The readEEM function is available in the package for reading in
the raw data files. Note, however, that only specific formats are supported. Aside from the
file format, readEEM only requires that the raw data be
split into separate files, one per sample.
Newer instrument software can unfold many samples into a single data matrix in which each row is a sample and each column is a wavelength condition. A procedure for this type of data matrix is described here.
readEEM function
readEEM accepts a path argument, which specifies the
path to files or folders.
library(EEM)
# importing files
data <- readEEM("sample1.txt") # read in a file
data <- readEEM(c("sample1.txt", "sample2.txt")) # read in two files
# importing folders
data <- readEEM("/data") # read in all files in data folder
data <- readEEM(c("/data", "/data2")) # read in all files in two folders
data <- readEEM("C:\\data") # full Windows path. Note that the backslash is doubled.
data <- readEEM("C:/data") # read in all files in data folder. Instead of doubled
# backslashes, a forward slash can also be used.
If the working folder that contains the raw data files has already been
set, readEEM(".") or readEEM(getwd()) can be
used to access them directly.
readEEM first checks the extension of the file, or of all
files in the folder. The acceptable extensions are *.csv, *.txt, *.xls,
*.xlsx, and *.dat. Files whose extensions are not in this list will not
be read in. It then checks the format of each file (see the next
section on company-specific formats). Files whose format is not
supported by readEEM will not be read in.
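The extension check described above can be pictured with a minimal sketch. This is not the code used inside readEEM; filter_by_extension is a hypothetical helper, and treating extensions case-insensitively is an assumption made for the illustration.

```r
# Hypothetical helper, not part of the EEM package: keep only the paths
# whose extension is in the list of accepted extensions.
accepted_ext <- c("csv", "txt", "xls", "xlsx", "dat")

filter_by_extension <- function(paths, accepted = accepted_ext) {
  # file_ext() returns the extension without the dot ("" if there is none).
  # Lower-casing is an assumption; readEEM may treat case differently.
  ext <- tolower(tools::file_ext(paths))
  paths[ext %in% accepted]
}

# Made-up file names for illustration.
candidates <- c("sample1.txt", "sample2.CSV", "notes.docx", "sample3.jwb")
kept <- filter_by_extension(candidates)
print(kept)
```

A preview of this kind can help explain why a file in a folder was skipped before looking at its internal format.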
To check the file format, readEEM looks for a certain keyword
that is located just above the raw data. The formats of different
companies are very similar except for the number of filler lines at the
top, so readEEM looks for the keyword that signals the end of the
filler lines and the start of the raw data.
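The keyword-based approach can be sketched in a few lines of base R. The helper read_after_keyword is hypothetical and is not the readEEM implementation, the file content is synthetic, and the comma delimiter and the "nm" corner label are assumptions made for the illustration.

```r
# Hypothetical helper, not the readEEM implementation: locate the keyword
# line and parse everything below it as a comma-separated table.
read_after_keyword <- function(lines, keyword) {
  start <- grep(keyword, lines, fixed = TRUE)[1]  # first line containing the keyword
  if (is.na(start)) stop("keyword not found: ", keyword)
  body <- lines[(start + 1):length(lines)]
  read.csv(text = paste(body, collapse = "\n"), header = TRUE,
           row.names = 1, check.names = FALSE)
}

# Synthetic file content, invented for illustration only.
raw <- c("TITLE,sample1",
         "DATE,2016/01/22",
         "XYDATA",
         "nm,200,210",
         "250,1.5,2.5",
         "260,3.5,4.5")

eem <- read_after_keyword(raw, "XYDATA")
print(eem)
```

Because the search is for the keyword rather than a fixed line count, the same helper works no matter how many filler lines precede the data.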
Reference machine: FP-8500
The raw files have the “*.jwb” extension. They must first be converted
into text-based files (*.csv, *.txt) using the machine software. The
keyword that precedes the data is “XYDATA”. Columns are excitation
wavelengths and rows are emission wavelengths.
Reference machine: F-7000
The keyword that precedes the data is “Data Points” or “データリスト” ("data list"). Columns
are excitation wavelengths and rows are emission wavelengths.
Reference machine: RF-6000
The keyword that precedes the data is “Rawdata” or “CorrectionData”.
Either the rows or the columns may hold the excitation wavelengths.
We assume that rows are excitation wavelengths and columns are emission
wavelengths unless the string “励起波長/蛍光波長” ("excitation
wavelength/fluorescence wavelength") is present. The exception has only
been added for the Japanese keyword because the only sample data
available to us are in Japanese.
Some recent machines can export an unfolded data matrix directly. An
unfolded data matrix is a matrix whose columns represent the
wavelength conditions and whose rows represent the samples. If the columns and
rows are reversed, the matrix can always be transposed using
t() after importing into R. Such a data matrix can be
imported into R as a data frame or matrix first, before
folding it into an EEM class object.
datamatrix <- read.csv("datamatrix.csv", row.names = 1) # for csv
datamatrix <- read.csv("datamatrix.txt", row.names = 1) # for comma-separated txt (use read.delim() if tab-delimited)
datamatrix[1:5,1:5] # check the first 5 rows and columns
Note that row.names = 1 was used to specify that the row
names exist and that they are in the first column.
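If the exported matrix has samples in columns instead of rows, the transposition with t() can look like the following sketch. The data are synthetic and the sample names are made up; the only point is the change of orientation.

```r
# Synthetic matrix, invented for illustration: samples are in columns here,
# which is the reverse of the orientation described above.
txt <- "wavelength,sampleA,sampleB
EX200EM230,1.0,2.0
EX200EM240,3.0,4.0
EX210EM230,5.0,6.0"

reversed <- read.csv(text = txt, row.names = 1)

# t() returns a matrix, so convert back to a data frame if one is needed.
datamatrix <- as.data.frame(t(reversed))
print(dim(datamatrix))
print(datamatrix)
```

After the transpose, each row is a sample and each column is a wavelength condition.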
EEM object
Before the folding operation, the data matrix must meet one condition:
the column names must follow the EX???EM??? format, where each ???
is a wavelength value in nm. Valid examples include EX200EM230
and, with decimal points, EX200.5EM400.5.
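A quick way to check column names against the EX???EM??? format before folding is a regular expression. The pattern below is an assumption about what counts as a valid wavelength value (digits with an optional decimal part); it is not taken from the package, which may be stricter or more lenient.

```r
# Assumed pattern: "EX", a number, "EM", a number. Not taken from the package.
pattern <- "^EX([0-9]+\\.?[0-9]*)EM([0-9]+\\.?[0-9]*)$"

# Two valid names from the text above and one made-up invalid name.
cols <- c("EX200EM230", "EX200.5EM400.5", "X200.230")

ok <- grepl(pattern, cols)  # which names follow the format
ex <- as.numeric(sub(pattern, "\\1", cols[ok]))  # excitation wavelengths (nm)
em <- as.numeric(sub(pattern, "\\2", cols[ok]))  # emission wavelengths (nm)

print(data.frame(name = cols[ok], ex = ex, em = em))
print(cols[!ok])  # names that would need fixing before folding
```

Listing the names that fail the check makes it easier to rename columns before the folding step.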
readEEM