Create Files

In this notebook, we’ll create some larcv files, store them to disk, and reload them to validate them.

import larcv
import numpy

Step 1: Create an output file

We use larcv’s IOManager to handle all file IO:

# What should the name of the file be?
output = "demo_output.h5"

# Create an output larcv file, using the WRITE mode
io_manager = larcv.IOManager(larcv.IOManager.kWRITE)
io_manager.set_out_file(str(output))
io_manager.initialize()
True

Step 2: Create data products

Creating data products in Python is straightforward. In this step, we define helper functions that build objects demonstrating most of the larcv datatypes, each filled with random data. In Step 3, we’ll store them to file.

# Dense Tensor Objects

def create_tensor1d(n_projection_ids, dense_shape=(512,)):
    # Create and return a list of Tensor1D objects with the given dense shape
    tensor_1ds = []
    for i in range(n_projection_ids):
        data = numpy.random.uniform(size=dense_shape).astype("float32")
        tensor = larcv.Tensor1D(data)
        # Creating from numpy automatically sets the projection ID to 0, so fix that:
        tensor.set_projection_id(i)
        tensor_1ds.append(tensor)
    return tensor_1ds

def create_tensor2d(n_projection_ids, dense_shape=(512, 512)):
    # Create and return a list of Tensor2D objects with the given dense shape
    tensor_2ds = []
    for i in range(n_projection_ids):
        data = numpy.random.uniform(size=dense_shape).astype("float32")
        tensor = larcv.Tensor2D(data)
        # Creating from numpy automatically sets the projection ID to 0, so fix that:
        tensor.set_projection_id(i)
        tensor_2ds.append(tensor)
    return tensor_2ds

def create_tensor3d(n_projection_ids, dense_shape=(128, 128, 128)):
    # Create and return a list of Tensor3D objects with the given dense shape
    tensor_3ds = []
    for i in range(n_projection_ids):
        data = numpy.random.uniform(size=dense_shape).astype("float32")
        tensor = larcv.Tensor3D(data)
        # Creating from numpy automatically sets the projection ID to 0, so fix that:
        tensor.set_projection_id(i)
        tensor_3ds.append(tensor)
    return tensor_3ds

def create_tensor4d(n_projection_ids, dense_shape=(64, 64, 64, 64)):
    # Create and return a list of Tensor4D objects with the given dense shape
    tensor_4ds = []
    for i in range(n_projection_ids):
        data = numpy.random.uniform(size=dense_shape).astype("float32")
        tensor = larcv.Tensor4D(data)
        # Creating from numpy automatically sets the projection ID to 0, so fix that:
        tensor.set_projection_id(i)
        tensor_4ds.append(tensor)
    return tensor_4ds

All of these functions are just creating random data - don’t read into it too much!
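The four helpers above differ only in the tensor class and the default shape, so the shared logic can be sketched as a single function. This is a minimal sketch, not part of larcv: `create_tensors` and `StandInTensor` are hypothetical names, and the stand-in class exists only so the example runs without larcv installed. It assumes nothing about the tensor class beyond what the helpers above already use: construction from a numpy array and a `set_projection_id` method.

```python
import numpy


def create_tensors(tensor_cls, n_projection_ids, dense_shape):
    """Build one randomly filled tensor per projection ID.

    `tensor_cls` is any class that can be constructed from a numpy array
    and exposes `set_projection_id` (hypothetical generic helper).
    """
    tensors = []
    for projection_id in range(n_projection_ids):
        # Uniform random values, cast to 32-bit floats as in the helpers above
        data = numpy.random.uniform(size=dense_shape).astype("float32")
        tensor = tensor_cls(data)
        tensor.set_projection_id(projection_id)
        tensors.append(tensor)
    return tensors


class StandInTensor:
    """Hypothetical stand-in so the sketch runs without larcv installed."""

    def __init__(self, data):
        self.data = data
        self.projection_id = 0

    def set_projection_id(self, projection_id):
        self.projection_id = projection_id


demo = create_tensors(StandInTensor, n_projection_ids=3, dense_shape=(4, 4))
print([t.projection_id for t in demo], demo[0].data.shape, demo[0].data.dtype)
```

With larcv available, passing `larcv.Tensor2D` and a shape of `(512, 512)` in place of the stand-in should behave like `create_tensor2d`, assuming the constructor and setter work as shown in the helpers above.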

Step 3: Write data to file

Typically, data is organized into “events” that have a run/subrun/event ID. Each event can contain multiple types of data products, and each product can hold multiple projection IDs. Writing a file is usually a loop over events from some sort of source.
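As a mental model of that layout, the nesting can be sketched with plain Python containers. This is purely illustrative and is not how larcv stores data internally; `EventSketch` and its fields are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class EventSketch:
    """Hypothetical picture of one event: an ID triple plus its products."""

    run: int
    subrun: int
    event: int
    # Maps (product type, producer label) -> {projection_id: payload}
    products: dict = field(default_factory=dict)

    def add(self, product_type, producer, projection_id, payload):
        key = (product_type, producer)
        self.products.setdefault(key, {})[projection_id] = payload


events = []
for event_number in range(2):
    ev = EventSketch(run=1, subrun=2, event=event_number)
    # One product ("tensor2d" labelled "demo_data") with three projections
    for projection_id in range(3):
        ev.add("tensor2d", "demo_data", projection_id, f"image-{projection_id}")
    events.append(ev)

print(len(events), sorted(events[0].products[("tensor2d", "demo_data")]))
```

The key point is the two levels of grouping: an event holds several labelled products, and each product holds one entry per projection ID.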

for event in range(10): # Loop over 10 events
    # Set the event identifiers.  It's not mandatory, but probably convenient:
    io_manager.set_id(1, 2, event) # Run 1, subrun 2, event number `event`

    # To create data, we get the "event_dataproduct" object from the io manager.
    # "demo_data" is a string identifier that is unique to this data product.
    event_tensor2d = io_manager.get_data("tensor2d", "demo_data")

    tensor_2d_list = create_tensor2d(3)

    # Write the tensors to the event_tensor2d object.
    # The inner loop uses its own variable so it does not shadow the event index.
    for projection_id, t in enumerate(tensor_2d_list):
        t.meta().set_projection_id(projection_id)
        event_tensor2d.append(t)

    # The data doesn't go to disk until you call save_entry:
    io_manager.save_entry()
    
# How many events did we save, total?
print(io_manager.get_n_entries())
10
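To sanity-check the size of the resulting file, it helps to estimate the raw payload: 10 events, 3 tensors per event, 512 x 512 values each, 4 bytes per float32 value. The helper below is a hypothetical back-of-the-envelope calculation; the actual file size on disk will differ because of metadata and any compression the file format applies.

```python
import math


def dense_payload_bytes(n_events, tensors_per_event, dense_shape, bytes_per_value=4):
    """Uncompressed size of the dense tensor values alone (hypothetical helper)."""
    values_per_tensor = math.prod(dense_shape)
    return n_events * tensors_per_event * values_per_tensor * bytes_per_value


total = dense_payload_bytes(n_events=10, tensors_per_event=3, dense_shape=(512, 512))
print(total, "bytes =", total / 2**20, "MiB")
```

For the loop above this comes to 31,457,280 bytes, or 30 MiB of raw float32 data.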

Finally!

Close the file:

io_manager.finalize()
    [NORMAL]  <IOManager::finalize> Closing output file
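If the event loop raises partway through, `finalize()` would never be reached. One way to guard against that is a small context manager that always finalizes on exit. This is a sketch under stated assumptions: `finalizing` and `StandInIO` are hypothetical names, not part of larcv, and the only thing assumed about the wrapped object is the `finalize()` method used above.

```python
from contextlib import contextmanager


@contextmanager
def finalizing(io):
    """Yield `io`, then call `io.finalize()` even if the body raises."""
    try:
        yield io
    finally:
        io.finalize()


class StandInIO:
    """Hypothetical stand-in for an initialized IOManager."""

    def __init__(self):
        self.finalized = False

    def finalize(self):
        self.finalized = True


stand_in = StandInIO()
try:
    with finalizing(stand_in):
        raise RuntimeError("simulated failure while writing events")
except RuntimeError:
    pass

print(stand_in.finalized)
```

In the notebook, the Step 3 loop would go inside a `with finalizing(io_manager):` block, replacing the explicit `io_manager.finalize()` call.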