The HiCData class

A class for handling HiC read data.

class hifive.hic_data.HiCData(filename, mode='r', silent=False)

This class handles interaction count data for HiC experiments.

This class stores mapped paired-end reads, indexing them by fragment-end (fend) number, in an h5dict.

Note

This class is also available as hifive.HiCData

When initialized, this class creates an h5dict in which to store all data associated with this object.

Parameters:
  • filename (str.) – The file name of the h5dict. This should end with the suffix ‘.hdf5’
  • mode (str.) – The mode in which to open the h5dict. Use ‘w’ to create a new h5dict, or to overwrite an existing one, with the name given in filename. The default is ‘r’.
  • silent (bool.) – If True, suppress printed information about function execution for this object.
Returns:

HiCData class object.

export_to_mat(outfilename)

Write reads loaded in data object to text file in HiCPipe-compatible ‘mat’ format.

Parameters:
  • outfilename (str.) – Specifies the file to save data in.
Returns:

None

load()

Load data from h5dict specified at object creation.

Any call of this function will overwrite current object data with values from the last save() call.

Returns:

None

load_data_from_bam(fendfilename, filelist, maxinsert)

Read interaction counts from one or more pairs of BAM-formatted alignment files and place in h5dict.

Parameters:
  • fendfilename (str.) – This specifies the file name of the Fend object to associate with the dataset.
  • filelist (list) – A list of mapped sequencing runs. Each run should be a list of the first and second read-end BAM files ([[run1_1, run1_2], [run2_1, run2_2], ...]). If only one pair of files is needed, the list may instead contain just the two file path strings.
  • maxinsert (int.) – A cutoff for filtering paired end reads whose total distance to their respective restriction sites exceeds this value.
Returns:

None
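The two accepted shapes of filelist can be illustrated with a small helper. This is only a sketch of the convention described above, not HiFive's own code: the function name normalize_bam_filelist and the BAM file names are hypothetical.

```python
def normalize_bam_filelist(filelist):
    """Return filelist as a list of [first_end, second_end] pairs.

    Hypothetical helper; it mirrors the documented convention only.
    """
    # A flat pair of path strings is shorthand for a single sequencing run.
    if len(filelist) == 2 and all(isinstance(f, str) for f in filelist):
        return [list(filelist)]
    runs = []
    for run in filelist:
        # Every run must name exactly two files: first and second read ends.
        if isinstance(run, str) or len(run) != 2:
            raise ValueError("each run must be a pair of BAM files: %r" % (run,))
        runs.append(list(run))
    return runs


# Made-up file names, used only to show the two input shapes.
single = normalize_bam_filelist(["run1_1.bam", "run1_2.bam"])
multi = normalize_bam_filelist(
    [["run1_1.bam", "run1_2.bam"], ["run2_1.bam", "run2_2.bam"]]
)
```

Both calls yield a list of two-element lists, so a single pair and several runs can be handled by the same downstream loop.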

load_data_from_mat(fendfilename, filename, maxinsert=0)

Read interaction counts from a HiCPipe-compatible ‘mat’ text file and place in h5dict.

Parameters:
  • fendfilename (str.) – This specifies the file name of the Fend object to associate with the dataset.
  • filename (str.) – File name of a ‘mat’ file containing fend pair and interaction count data.
  • maxinsert (int.) – A cutoff for filtering paired end reads whose total distance to their respective restriction sites exceeds this value.
Returns:

None

load_data_from_raw(fendfilename, filelist, maxinsert)

Read interaction counts from one or more text files and place in h5dict.

Files should contain both mapped ends of a read, one read per line, separated by tabs. Each line should be in the following format:

chromosome1    coordinate1  strand1   chromosome2    coordinate2  strand2

where strands are given by the characters ‘+’ and ‘-’.

Parameters:
  • fendfilename (str.) – This specifies the file name of the Fend object to associate with the dataset.
  • filelist (list) – A list containing all of the file names of mapped read text files to be included in the dataset. If only one file is needed, this may be passed as a string.
  • maxinsert (int.) – A cutoff for filtering paired end reads whose total distance to their respective restriction sites exceeds this value.
Returns:

None
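As an illustration of the six-column, tab-separated layout described above, the following sketch writes a small raw file using only the standard library. The helper names, the read pairs, and the chromosome naming are invented for the example; chromosome names in real input presumably need to match those in the associated Fend object.

```python
import os
import tempfile


def format_raw_line(chrom1, coord1, strand1, chrom2, coord2, strand2):
    """Format one read pair as a tab-separated raw line (hypothetical helper)."""
    for strand in (strand1, strand2):
        if strand not in ("+", "-"):
            raise ValueError("strand must be '+' or '-', got %r" % (strand,))
    fields = (chrom1, coord1, strand1, chrom2, coord2, strand2)
    return "\t".join(str(f) for f in fields)


def write_raw_file(path, read_pairs):
    """Write one read pair per line to path."""
    with open(path, "w") as handle:
        for pair in read_pairs:
            handle.write(format_raw_line(*pair) + "\n")


# Made-up read pairs, purely for illustration.
pairs = [
    ("chr1", 10500, "+", "chr1", 98200, "-"),
    ("chr2", 4000, "-", "chr5", 77310, "+"),
]
raw_path = os.path.join(tempfile.mkdtemp(), "example_reads.raw")
write_raw_file(raw_path, pairs)

# Read the file back to confirm the layout.
with open(raw_path) as handle:
    lines = handle.read().splitlines()
```

A file written this way could then be named in filelist, either inside a list or as a single string.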

save()

Save analysis parameters to h5dict.

Returns:

None
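The methods above might be combined roughly as follows. This is a minimal sketch that uses only the signatures documented in this section; the wrapper function build_hic_dataset and its argument names are hypothetical, and the guarded import lets the snippet load even where hifive is not installed.

```python
try:
    import hifive  # third-party package; may not be installed
except ImportError:
    hifive = None


def build_hic_dataset(out_fname, fend_fname, raw_files, maxinsert):
    """Create an h5dict, load raw read pairs into it, and save it.

    Hypothetical wrapper around the documented HiCData methods.
    """
    if hifive is None:
        raise RuntimeError("hifive is not installed")
    # mode='w' creates (or overwrites) the h5dict named by out_fname.
    data = hifive.HiCData(out_fname, mode='w', silent=True)
    # Associate the reads with the Fend object stored in fend_fname.
    data.load_data_from_raw(fend_fname, raw_files, maxinsert)
    data.save()
    return data
```

A call such as build_hic_dataset("example_data.hdf5", "example_fends.hdf5", ["example_reads.raw"], 500) would then produce the dataset file; all file names and the maxinsert value here are placeholders.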