Complete API Reference

This page contains the complete API documentation for all Python modules.

crisscross Module

Core Functions

Megastructures

crisscross.core_functions.megastructures

Megastructure

Megastructure(
    slat_array=None,
    slat_coordinate_dict=None,
    layer_interface_orientations=None,
    connection_angle="90",
    slat_type_dict=None,
    import_design_file=None,
)

Convenience class that bundles all the details of a megastructure, including slat positions, seed handles and cargo.

PARAMETER DESCRIPTION
slat_array

Array of slat positions (3D - X,Y, layer ID) containing the positions of all slats in the design.

DEFAULT: None

slat_coordinate_dict

Dictionary of slat coordinates (key = (layer, slat ID), value = list of (y,x) coordinates). This is optional, but required to properly adjust positions of double-barrel slats.

DEFAULT: None

layer_interface_orientations

The direction each slat will be facing in the design. E.g. for a 2 layer design, [2, 5, 2] implies that the bottom layer will have H2 handles sticking out, the connecting interface will have H5 handles and the top layer will have H2 handles again.

DEFAULT: None

connection_angle

The angle at which the slats will be connected. For now, only 90-degree and 60-degree grids are supported.

DEFAULT: '90'

slat_type_dict

Dictionary of slat types (key = (layer, slat ID), value = slat type string)

DEFAULT: None

import_design_file

If provided, the design will be imported from the specified file instead of being built from scratch.

DEFAULT: None

direct_handle_assign

direct_handle_assign(
    slat,
    slat_position_index,
    slat_side,
    handle_val,
    category,
    descriptor,
    sel_plates=None,
    update=False,
    suppress_warnings=False,
)

Assigns a handle to a slat's specific side/position.

PARAMETER DESCRIPTION
slat

Slat object to attach to.

slat_position_index

Position along slat (1-indexed).

slat_side

2 or 5.

handle_val

Actual payload handle value.

category

SEED, CARGO, ASSEMBLY_HANDLE or ASSEMBLY_ANTIHANDLE

descriptor

Long-form description of handle contents

sel_plates

Plates from which to extract sequence data

DEFAULT: None

update

Only update a handle's value instead of setting a new one.

DEFAULT: False

suppress_warnings

Set to true to suppress warnings when overwriting existing handles.

DEFAULT: False

smart_set_slat_handle

smart_set_slat_handle(
    slat_key,
    position,
    side,
    handle_val,
    category,
    descriptor,
    sel_plates=None,
    suppress_warnings=False,
    coordinate_to_slats=None,
)

Sets a handle on a slat and propagates it through phantom slats and linked handles according to the design rules.

PARAMETER DESCRIPTION
slat_key

Slat ID string (e.g., 'layer1-slat5')

position

Position index on the slat (1-based)

side

Handle side (2 or 5)

handle_val

Handle value/ID

category

Handle category ('ASSEMBLY_HANDLE', 'ASSEMBLY_ANTIHANDLE', 'CARGO', 'SEED', etc.)

descriptor

Longer-form description of handle contents

sel_plates

Optional plate object to extract handle sequences from

DEFAULT: None

suppress_warnings

If True, suppresses overwrite warnings

DEFAULT: False

coordinate_to_slats

Dictionary mapping (y, x, layer) to a list of (slat_key, position) at that coordinate (if not provided will be computed on-the-fly).

DEFAULT: None

RETURNS DESCRIPTION

N/A
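The coordinate_to_slats lookup described above can be pictured with a small sketch. This is not the library's implementation: it assumes slat coordinates are available as a plain dictionary keyed by slat key string, with each value an ordered list of (y, x) positions, and that a second dictionary gives each slat's layer. The helper name build_coordinate_to_slats and both input layouts are hypothetical.

```python
from collections import defaultdict


def build_coordinate_to_slats(slat_coordinates, slat_layers):
    """Map (y, x, layer) -> list of (slat_key, position), positions 1-based.

    Hypothetical helper for illustration only; the input layouts are assumed.
    """
    lookup = defaultdict(list)
    for slat_key, coords in slat_coordinates.items():
        layer = slat_layers[slat_key]
        for position, (y, x) in enumerate(coords, start=1):
            lookup[(y, x, layer)].append((slat_key, position))
    return dict(lookup)


# Toy data: two crossing slats on different layers.
slat_coordinates = {
    "layer1-slat1": [(0, 0), (0, 1), (0, 2)],
    "layer2-slat1": [(0, 1), (1, 1), (2, 1)],
}
slat_layers = {"layer1-slat1": 1, "layer2-slat1": 2}
coordinate_to_slats = build_coordinate_to_slats(slat_coordinates, slat_layers)
```

Pre-computing a map like this once and passing it in avoids rebuilding it on every call when many handles are set in a loop.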

smart_handle_delete

smart_handle_delete(slat_key, slat_position, slat_side, coordinate_to_slats=None)

Deletes a handle from a slat and propagates it through phantom slats and linked handles according to standard design rules.

PARAMETER DESCRIPTION
slat_key

The target slat ID. Cannot be a phantom slat.

slat_position

Position index on the slat (1-based).

slat_side

2 or 5

coordinate_to_slats

Can pre-compute slat lookup map if desired.

DEFAULT: None

enforce_phantom_links_on_assembly_handle_array(handle_array, modify_in_place=True)

Synchronizes handle values across all linked positions in an assembly handle array.

This method ensures that:

  1. Phantom slats share handle values with their parent reference slats
  2. Physically connected slats at layer interfaces have matching handle/antihandle pairs
  3. Explicitly linked handles (via link_manager) share the same values
  4. Enforced values (zeros or specific handle IDs) are applied where required

IMPORTANT: Call this after random handle generation but before evolution scoring, to ensure the handle array respects all linking constraints.

Algorithm:

  - Iterates through every non-zero position in the 3D handle array
  - For each position, triggers _recursive_propagate_handle to propagate the value through all linked handles (phantoms, physical attachments, explicit links)
  - The recursion handles phantom networks, physical attachments, and explicit links

PARAMETER DESCRIPTION
handle_array

3D numpy array of handle values with shape (y, x, layer_interface)

modify_in_place

If True, modifies the input array; if False, creates a copy

DEFAULT: True

RETURNS DESCRIPTION

Modified handle array with all links enforced
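The synchronization idea can be illustrated with a much-simplified sketch. It assumes the links are already known as explicit groups of (y, x, layer_interface) positions and that the handle array is a nested list; it does not reproduce _recursive_propagate_handle, phantom lookups, or enforced values. The function name enforce_links is hypothetical.

```python
import copy


def enforce_links(handle_array, linked_groups, modify_in_place=True):
    """Copy the first non-zero value in each linked group to every member.

    handle_array is indexed as handle_array[y][x][interface]; linked_groups is
    a list of lists of (y, x, interface) tuples. Both layouts are assumptions.
    """
    result = handle_array if modify_in_place else copy.deepcopy(handle_array)
    for group in linked_groups:
        values = [result[y][x][i] for (y, x, i) in group if result[y][x][i] != 0]
        if not values:
            continue  # nothing to propagate in this group
        for (y, x, i) in group:
            result[y][x][i] = values[0]
    return result


# 1 x 3 grid with a single layer interface; the first two positions are linked.
handles = [[[4], [0], [9]]]
groups = [[(0, 0, 0), (0, 1, 0)]]
synced = enforce_links(handles, groups, modify_in_place=False)
```

With modify_in_place=False the input is left untouched, mirroring the copy behaviour described for the parameter of the same name.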

assign_assembly_handles

assign_assembly_handles(
    handle_arrays,
    crisscross_handle_plates=None,
    crisscross_antihandle_plates=None,
    suppress_warnings=False,
)

Assigns crisscross handles to the slats based on the handle arrays provided. IMPORTANT: no phantom handle validation occurs here; make sure to do this before running this function.

PARAMETER DESCRIPTION
handle_arrays

3D array of handle values (X, Y, layer) where each value corresponds to a handle ID.

crisscross_handle_plates

Crisscross handle plates. If not supplied, a placeholder will be added to the slat instead.

DEFAULT: None

crisscross_antihandle_plates

Crisscross anti-handle plates. If not supplied, a placeholder will be added to the slat instead. TODO: this function assumes the pattern is always handle -> antihandle -> handle -> antihandle etc. Can we make this customizable?

DEFAULT: None

RETURNS DESCRIPTION

N/A

assign_seed_handles

assign_seed_handles(seed_dict, seed_plate=None)

Assigns seed handles to the slats based on the seed dictionary provided.

generate_slat_occupancy_grid

generate_slat_occupancy_grid(use_original_slat_array=False, category='original_slats')

Generates a 3D occupancy grid of the slats in the design.

PARAMETER DESCRIPTION
use_original_slat_array

If True, uses the original slat array instead of the current slat state.

DEFAULT: False

category

'original_slats', 'phantom_slats' or 'all_slats' - for phantoms, the phantom parent IDs are used.

DEFAULT: 'original_slats'

RETURNS DESCRIPTION

3D numpy array with slat IDs at each position (X, Y, layer)

generate_assembly_handle_grid

generate_assembly_handle_grid(category='original_slats', create_mutation_mask=False)

Generates a 3D occupancy grid of the assembly handles in the design.

PARAMETER DESCRIPTION
category

'original_slats', 'phantom_slats' or 'all_slats'

DEFAULT: 'original_slats'

create_mutation_mask

If True, creates an integer mask indicating positions that can be mutated. Areas with linked handles are given the same integers. These integers have nothing to do with handle IDs.

DEFAULT: False

RETURNS DESCRIPTION

3D numpy array with handle IDs at each position (X, Y, layer)
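One way to picture the mutation mask is sketched below for a single layer interface. The labelling scheme is an assumption: linked positions share one integer, every other occupied position gets its own integer, and 0 marks positions without a handle. The function name build_mutation_mask is hypothetical and the library's actual mask may be laid out differently.

```python
def build_mutation_mask(handle_grid, linked_groups):
    """Return a 2D integer mask: 0 = no handle, equal integers = linked positions."""
    mask = [[0 for _ in row] for row in handle_grid]
    next_label = 1
    # Linked positions share a single label.
    for group in linked_groups:
        for (y, x) in group:
            mask[y][x] = next_label
        next_label += 1
    # Every remaining occupied position gets a label of its own.
    for y, row in enumerate(handle_grid):
        for x, value in enumerate(row):
            if value != 0 and mask[y][x] == 0:
                mask[y][x] = next_label
                next_label += 1
    return mask


handle_grid = [[3, 3, 0],
               [7, 0, 5]]
mask = build_mutation_mask(handle_grid, linked_groups=[[(0, 0), (0, 1)]])
```

The labels only identify which positions must mutate together; as the parameter description notes, they are unrelated to handle IDs.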

assign_cargo_handles_with_dict

assign_cargo_handles_with_dict(cargo_dict, cargo_plate=None)

Assigns cargo handles to the megastructure slats based on the cargo dictionary provided.

PARAMETER DESCRIPTION
cargo_dict

Dictionary of cargo placements (key = slat position, layer, handle orientation, value = cargo ID)

cargo_plate

The cargo plate from which cargo will be assigned. If not provided, a placeholder will be assigned instead.

DEFAULT: None

RETURNS DESCRIPTION

N/A

convert_cargo_array_into_cargo_dict

convert_cargo_array_into_cargo_dict(
    cargo_array, cargo_keymap, layer, handle_orientation=None
)

Converts a cargo array into a dictionary that can be used to assign cargo handles to the slats.

PARAMETER DESCRIPTION
cargo_array

Numpy array with cargo IDs (and 0s where no cargo is present).

cargo_keymap

A dictionary converting cargo ID numbers into unique strings.

layer

The layer the cargo should be assigned to (either top, bottom or a specific number)

handle_orientation

The specific slat handle orientation to which the cargo is assigned.

DEFAULT: None

RETURNS DESCRIPTION

Dictionary of converted cargo.
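The conversion can be sketched as follows. The exact key layout is an assumption based on the cargo_dict description (slat position, layer, handle orientation); here it is written as ((y, x), layer, handle_orientation). The function name and the cargo names are made up for illustration.

```python
def cargo_array_to_dict(cargo_array, cargo_keymap, layer, handle_orientation):
    """Turn a 2D grid of cargo IDs into {((y, x), layer, orientation): name}."""
    cargo_dict = {}
    for y, row in enumerate(cargo_array):
        for x, cargo_id in enumerate(row):
            if cargo_id == 0:
                continue  # 0 means no cargo at this position
            cargo_dict[((y, x), layer, handle_orientation)] = cargo_keymap[cargo_id]
    return cargo_dict


cargo_array = [[0, 1],
               [2, 0]]
cargo_keymap = {1: "cargo_A", 2: "cargo_B"}  # hypothetical cargo names
cargo_dict = cargo_array_to_dict(cargo_array, cargo_keymap, layer=1, handle_orientation=5)
```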

assign_cargo_handles_with_array

assign_cargo_handles_with_array(
    cargo_array, cargo_key, cargo_plate=None, layer="top", handle_orientation=None
)

Assigns cargo handles to the megastructure slats based on the cargo array provided.

PARAMETER DESCRIPTION
cargo_array

2D array containing cargo IDs (must match plate provided).

cargo_key

Dictionary mapping cargo IDs to cargo unique names for proper identification.

cargo_plate

Plate class with sequences to draw from. If not provided, a placeholder will be assigned instead.

DEFAULT: None

layer

Either 'top' or 'bottom', or the exact layer ID required.

DEFAULT: 'top'

handle_orientation

If a middle layer is specified, then the handle orientation must be provided since there are always two options available.

DEFAULT: None

patch_placeholder_handles

patch_placeholder_handles(plates)

Patches placeholder handles with actual handles based on the plates provided.

PARAMETER DESCRIPTION
plates

List of plates from which to extract handles.

RETURNS DESCRIPTION

N/A

patch_flat_staples

patch_flat_staples(flat_plate)

Fills up all remaining holes in slats with no-handle control sequences.

PARAMETER DESCRIPTION
flat_plate

Plate class with flat sequences to draw from.

get_slats_by_assembly_stage

get_slats_by_assembly_stage(minimum_handle_cutoff=16)

Runs through the design and separates out all slats into groups sorted on their predicted assembly stage.

PARAMETER DESCRIPTION
minimum_handle_cutoff

Minimum number of handles that need to be present for a slat to be considered stably set up.

DEFAULT: 16

RETURNS DESCRIPTION

Dict of slats (key = slat ID, value = assembly order)

get_slat_match_counts

get_slat_match_counts(use_external_handle_array=None)

Runs through the design and counts how many slats have a certain number of connections (matches) to other slats. Useful for computing the hamming distance of a design with variable slat types.

RETURNS DESCRIPTION

Dictionary of match counts (key = number of matches, value = number of slat pairs with that many matches), and a connection graph that lists all slat pairs with a certain number of matches.

TODO: what to do in the case of slats with multiple layers e.g. the sierpinski slats?

TODO: as with other functions, this function also assumes handle -> antihandle -> handle -> antihandle pattern.

get_bag_of_slat_handles

get_bag_of_slat_handles(
    use_original_slat_array=False,
    use_external_handle_array=None,
    remove_blank_slats=False,
)

Obtains two dictionaries - both containing the arrays corresponding to the handle positions of each slat in the megastructure (one for handles and one for antihandles). The keys correspond to the slat ID while the values contain individual numpy handle arrays, one for each slat. The handle arrays represent the actual 2D shape of the slat in the design. Non-2D 60 degree slats are converted into 2D triangular coordinates to make handle match computation significantly simpler.

get_parasitic_interactions

get_parasitic_interactions()

Computes the match strength score of the megastructure design based on the assembly handles present. 4 items are provided in a dictionary:

  - the worst match score (i.e. the maximum number of matches between any two slats in the design)
  - the mean log score (i.e. the log of the mean number of matches between all slat pairs in the design)
  - the similarity score (i.e. the maximum number of matches between slats of the same type, which could result in a slat taking the place of another slat in the design if strong enough)
  - the full match histogram, which could be helpful for investigating designs with unexpected scores.
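To make the worst match score and the match histogram concrete, here is a deliberately reduced sketch. It compares handle and antihandle slats in one fixed alignment only and treats 0 as "no handle"; whether and how the library compares other alignments, and how it computes the mean log and similarity scores, is not covered. All names and data below are hypothetical.

```python
from collections import Counter
from itertools import product


def count_matches(handles, antihandles):
    """Count positions where a non-zero handle meets an equal antihandle value."""
    return sum(1 for h, a in zip(handles, antihandles) if h != 0 and h == a)


# Toy 4-position slats; real slats are longer.
handle_slats = {"layer1-slat1": [1, 2, 3, 0], "layer1-slat2": [4, 2, 1, 1]}
antihandle_slats = {"layer2-slat1": [1, 2, 0, 0], "layer2-slat2": [4, 5, 1, 1]}

pair_matches = {
    (h_key, a_key): count_matches(h, a)
    for (h_key, h), (a_key, a) in product(handle_slats.items(), antihandle_slats.items())
}
match_histogram = Counter(pair_matches.values())  # matches -> number of slat pairs
worst_match = max(pair_matches.values())
```

A histogram with most pairs at low match counts and a small worst match value is the pattern one would look for when checking a design for unintended slat binding.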

create_graphical_slat_view

create_graphical_slat_view(
    save_to_folder=None,
    instant_view=True,
    include_cargo=True,
    include_seed=True,
    filename_prepend="",
    colormap=None,
    cargo_colormap=None,
)

Creates a graphical view of the slats, cargo and seeds in the design. Refer to the graphics module for more details.

PARAMETER DESCRIPTION
save_to_folder

Set to the filepath of a folder where all figures will be saved.

DEFAULT: None

instant_view

Set to True to plot the figures immediately to your active view.

DEFAULT: True

include_cargo

Set to True to include cargo in the graphical view.

DEFAULT: True

include_seed

Set to True to include the seed in the graphical view.

DEFAULT: True

filename_prepend

String to prepend to the filename of generated figures.

DEFAULT: ''

colormap

The colormap to sample from for each additional layer.

DEFAULT: None

cargo_colormap

The colormap to sample from for each cargo type.

DEFAULT: None

RETURNS DESCRIPTION

N/A

create_graphical_assembly_handle_view

create_graphical_assembly_handle_view(
    save_to_folder=None, instant_view=True, filename_prepend="", colormap=None
)

Creates a graphical view of the assembly handles in the design. Refer to the graphics module for more details.

PARAMETER DESCRIPTION
save_to_folder

Set to the filepath of a folder where all figures will be saved.

DEFAULT: None

instant_view

Set to True to plot the figures immediately to your active view.

DEFAULT: True

filename_prepend

String to prepend to the filename of generated figures.

DEFAULT: ''

colormap

The colormap to sample from for each additional layer.

DEFAULT: None

RETURNS DESCRIPTION

N/A

create_graphical_3D_view

create_graphical_3D_view(
    save_folder,
    window_size=(2048, 2048),
    filename_prepend="",
    colormap=None,
    cargo_colormap=None,
)

Creates a 3D video of the megastructure slat design.

PARAMETER DESCRIPTION
save_folder

Folder to save all video to.

window_size

Resolution of video generated. 2048x2048 seems reasonable in most cases.

DEFAULT: (2048, 2048)

filename_prepend

String to prepend to the filename of the video.

DEFAULT: ''

colormap

Colormap to extract layer colors from

DEFAULT: None

cargo_colormap

Colormap to extract cargo colors from

DEFAULT: None

RETURNS DESCRIPTION

N/A

create_blender_3D_view

create_blender_3D_view(
    save_folder,
    animate_assembly=False,
    animation_type="translate",
    custom_assembly_groups=None,
    slat_translate_dict=None,
    minimum_slat_cutoff=15,
    camera_spin=False,
    correct_slat_entrance_direction=True,
    force_slat_color_by_layer=True,
    colormap=None,
    cargo_colormap=None,
    slat_flip_list=None,
    include_bottom_light=False,
    filename_prepend="",
)

Creates a 3D model of the megastructure slat design as a Blender file.

PARAMETER DESCRIPTION
save_folder

Folder to save all video to.

animate_assembly

Set to true to also generate an animation of the design being assembled group by group.

DEFAULT: False

animation_type

Type of animation to generate. Options are 'translate' and 'wipe_in'.

DEFAULT: 'translate'

custom_assembly_groups

If set, will use the specific provided dictionary to assign slats to the animation order.

DEFAULT: None

slat_translate_dict

If set, will use the specific provided dictionary to assign specific animation translation distances to each slat.

DEFAULT: None

minimum_slat_cutoff

Minimum number of slats that need to be present for a slat to be considered stable. You might want to vary this number as certain designs have staggers that don't allow for a perfect 16-slat binding system.

DEFAULT: 15

camera_spin

Set to true to have camera spin around object during the animation.

DEFAULT: False

correct_slat_entrance_direction

If set to true, will attempt to correct the slat entrance animation to always start from a place that is supported.

DEFAULT: True

force_slat_color_by_layer

If set to true, will force the slat color to be the same as the layer color, regardless of the animation groups.

DEFAULT: True

colormap

Colormap to extract layer colors from

DEFAULT: None

cargo_colormap

Colormap to extract cargo colors from

DEFAULT: None

slat_flip_list

List of slat IDs - if a slat is in this list, its animation direction will be flipped. This cannot be used in conjunction with the correct_slat_entrance_direction parameter.

DEFAULT: None

include_bottom_light

Set to true to add a light source at the bottom of the design.

DEFAULT: False

filename_prepend

String to prepend to the filename of the Blender file.

DEFAULT: ''

RETURNS DESCRIPTION

N/A

create_standard_graphical_report

create_standard_graphical_report(
    output_folder,
    generate_3d_video=True,
    colormap=None,
    cargo_colormap=None,
    filename_prepend="",
)

Generates entire set of graphical reports for the megastructure design.

PARAMETER DESCRIPTION
output_folder

Output folder to save all images to.

generate_3d_video

If set to true, will generate a 3D video of the design.

DEFAULT: True

colormap

Colormap to extract layer colors from.

DEFAULT: None

cargo_colormap

Colormap to extract cargo colors from.

DEFAULT: None

filename_prepend

String to prepend to the filename of generated figures.

DEFAULT: ''

RETURNS DESCRIPTION

N/A

export_design

export_design(filename, folder)

Exports the entire design to a single Excel file. All individual slat, cargo, handle and seed arrays are exported into separate sheets.

PARAMETER DESCRIPTION
filename

Output .xlsx filename

folder

Output folder

RETURNS DESCRIPTION

N/A

import_design

import_design(file)

Reads in a complete megastructure from an Excel file in which each array is stored in a separate sheet.

PARAMETER DESCRIPTION
file

Path to Excel file containing megastructure design

RETURNS DESCRIPTION

All arrays and metadata necessary to regenerate the design

Megastructure Composition

crisscross.core_functions.megastructure_composition

visualize_output_plates

visualize_output_plates(
    output_well_descriptor_dict,
    plate_size,
    save_folder,
    save_file,
    slat_display_format="pie",
    plate_display_aspect_ratio=1.495,
)

Prepares a visualization of the output plates for the user to be able to verify the design (or print out and use in the lab).

PARAMETER DESCRIPTION
output_well_descriptor_dict

Dictionary where key = (plate_number, well), and the value is a list with: [slat_name, [category for all 32 handles on H2 side e.g. 0 2 3 4 0], [same for H5 side], slat_color, slat_type], where the categories are: 0 = control, 1 = assembly, 2 = seed, 3 = cargo, 4 = undefined

plate_size

Either 96 or 384

save_folder

Output folder

save_file

Output filename

plate_display_aspect_ratio

Aspect ratio to use for figure display - default matches true plate dimensions

DEFAULT: 1.495

slat_display_format

Set to 'pie' to output an occupancy pie chart for each well, 'barcode' to output a barcode showing the category of each individual handle, or 'stacked_barcode' for a more detailed view

DEFAULT: 'pie'

RETURNS DESCRIPTION

N/A
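The shape of one output_well_descriptor_dict entry, and the kind of tally a pie-style display would need, can be sketched as below. The slat name, color and the integer plate number are made-up values, and category_fractions is a hypothetical helper rather than part of the module.

```python
from collections import Counter

CATEGORY_NAMES = {0: "control", 1: "assembly", 2: "seed", 3: "cargo", 4: "undefined"}

# One entry: (plate_number, well) -> [name, H2 categories, H5 categories, color, type]
h2_categories = [1] * 32           # all 32 H2 positions carry assembly handles
h5_categories = [2] * 5 + [0] * 27  # 5 seed handles, the rest control staples
output_well_descriptor_dict = {
    (1, "A1"): ["layer1-slat1", h2_categories, h5_categories, "#1f77b4", "tube"],
}


def category_fractions(descriptor):
    """Return the fraction of the 64 handle positions in each category."""
    _, h2, h5, _, _ = descriptor
    counts = Counter(h2 + h5)
    total = len(h2) + len(h5)
    return {CATEGORY_NAMES[code]: n / total for code, n in counts.items()}


fractions = category_fractions(output_well_descriptor_dict[(1, "A1")])
```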

convert_slats_into_echo_commands

convert_slats_into_echo_commands(
    slat_dict,
    destination_plate_name,
    output_folder,
    output_filename,
    reference_transfer_volume_nl=75,
    reference_concentration_uM=500,
    transfer_volume_multiplier_for_slats=None,
    source_plate_type="384PP_AQ_BP",
    output_empty_wells=False,
    manual_plate_well_assignments=None,
    unique_transfer_volume_for_plates=None,
    output_plate_size="96",
    center_only_well_pattern=False,
    generate_plate_visualization=True,
    plate_viz_type="stacked_barcode",
    destination_well_max_volume=25,
    normalize_volumes=False,
    color_output_wells=True,
)

Converts a dictionary of slats into an echo liquid handler command list for all handles provided.

PARAMETER DESCRIPTION
slat_dict

Dictionary of slat objects

destination_plate_name

The name of the design's destination output plate

output_folder

The output folder to save the file to

output_filename

The name of the output file

reference_transfer_volume_nl

The default transfer volume for each handle (in nL)

DEFAULT: 75

reference_concentration_uM

The default concentration matching that of the selected transfer volume (in uM)

DEFAULT: 500

transfer_volume_multiplier_for_slats

Dictionary assigning a special transfer multiplier for specified slats (will be applied to all special plate volumes too)

DEFAULT: None

source_plate_type

The physical plate type in use

DEFAULT: '384PP_AQ_BP'

output_empty_wells

Outputs an empty row for a well if a handle is only a placeholder

DEFAULT: False

manual_plate_well_assignments

The specific output wells to use for each slat (if not provided, the script will automatically assign wells). This can be either a list of tuples (plate number, well name) or a dictionary of tuples.

DEFAULT: None

unique_transfer_volume_for_plates

Dictionary assigning a special transfer volume for certain plates (supersedes all other settings)

DEFAULT: None

output_plate_size

Either '96' or '384' for the output plate size

DEFAULT: '96'

center_only_well_pattern

Set to true to force output wells to be in the center of the plate. This is only available for 96-well plates.

DEFAULT: False

generate_plate_visualization

Set to true to generate a graphic showing the positions and contents of each well in the output plates

DEFAULT: True

plate_viz_type

Set to 'barcode' to show a barcode of the handle types in each well, 'pie' to show a pie chart of the handle types or 'stacked_barcode' to show a more detailed view

DEFAULT: 'stacked_barcode'

destination_well_max_volume

The maximum total volume that can be transferred to a well in the output plate (in uL)

DEFAULT: 25

normalize_volumes

Set to True to normalize the volumes in each slat mixture (by adding water to the maximum volume)

DEFAULT: False

color_output_wells

Set to True to add a color border to each well according to the slat's layer color or unique color (if available)

DEFAULT: True

RETURNS DESCRIPTION

Pandas dataframe corresponding to the output Echo handler command list
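The relationship between the reference volume, the reference concentration and the destination well limit can be worked through with a short calculation. The scaling rule used here (volume inversely proportional to stock concentration, so each handle contributes the same amount of material) is an assumption about how the two reference parameters interact, not a statement of the function's internals. The helper name is hypothetical.

```python
def scaled_transfer_volume_nl(stock_concentration_uM,
                              reference_transfer_volume_nl=75,
                              reference_concentration_uM=500):
    """Volume (nL) delivering the same amount as the reference transfer."""
    return reference_transfer_volume_nl * reference_concentration_uM / stock_concentration_uM


# Toy slat: 60 handles at the reference concentration, 4 at half of it.
volumes_nl = [scaled_transfer_volume_nl(500)] * 60 + [scaled_transfer_volume_nl(250)] * 4
total_ul = sum(volumes_nl) / 1000  # convert nL to uL
fits_in_well = total_ul <= 25      # compare with destination_well_max_volume (uL)
```

Under these assumptions a 64-handle slat needs 60 x 75 nL + 4 x 150 nL = 5100 nL, which is 5.1 uL and well below the default 25 uL limit.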

Slats

crisscross.core_functions.slats

Slat

Slat(
    ID,
    layer,
    slat_coordinates,
    non_assembly_slat=False,
    unique_color=None,
    layer_color=None,
    slat_type="tube",
    phantom_parent=None,
)

Wrapper class to hold all of a slat's handles and related details.

PARAMETER DESCRIPTION
ID

Slat unique ID (string)

layer

Layer position for slat (normally 1 and above, but can be set to 0 for special slats such as crossbars)

slat_coordinates

Exact positions slat occupies on a 2D grid - either a list of tuples or a dict of lists, where the key is the handle number on the slat

non_assembly_slat

If True, this slat is not used for assembly (e.g., crossbar slats)

DEFAULT: False

unique_color

Optional hexcode color to assign a unique color to the slat for graphics

DEFAULT: None

phantom_parent

If this slat is a linked copy of another slat (and thus a 'fake' slat used for handle linking purposes), then set the parent slat here.

DEFAULT: None

reverse_direction

reverse_direction()

Reverses the handle order on the slat (this should not affect the design placement of the slat).

get_sorted_handles

get_sorted_handles(side='h2')

Returns a sorted list of all handles on the slat (as they can be jumbled up sometimes, depending on the order they were created).

PARAMETER DESCRIPTION
side

h2 or h5

DEFAULT: 'h2'

RETURNS DESCRIPTION

tuple of handle ID and handle dict contents

set_placeholder_handle

set_placeholder_handle(
    handle_id, slat_side, category, value, descriptor, suppress_warnings=False
)

Assigns a placeholder to the slat, instead of a full handle.

PARAMETER DESCRIPTION
handle_id

Handle position on slat

slat_side

H2 or H5

descriptor

Description to use for placeholder

RETURNS DESCRIPTION

N/A

remove_handle

remove_handle(handle_id, slat_side)

Removes a handle from the slat.

PARAMETER DESCRIPTION
handle_id

Handle position on slat

slat_side

H2 or H5

RETURNS DESCRIPTION

N/A

update_placeholder_handle

update_placeholder_handle(
    handle_id,
    slat_side,
    sequence,
    well,
    plate_name,
    category,
    value,
    concentration,
    descriptor="No Desc.",
)

Updates a placeholder handle with the actual handle.

PARAMETER DESCRIPTION
handle_id

Handle position on slat

slat_side

H2 or H5

sequence

Exact handle sequence

well

Exact plate well

plate_name

Exact plate name

descriptor

Exact description of handle

DEFAULT: 'No Desc.'

RETURNS DESCRIPTION

N/A

set_handle

set_handle(
    handle_id,
    slat_side,
    sequence,
    well,
    plate_name,
    category,
    value,
    concentration,
    descriptor="No Desc.",
)

Defines the full details of a handle on a slat.

PARAMETER DESCRIPTION
handle_id

Handle position on slat

slat_side

H2 or H5

sequence

Exact handle sequence

well

Exact plate well

plate_name

Exact plate name

descriptor

Exact description of handle

DEFAULT: 'No Desc.'

RETURNS DESCRIPTION

N/A

get_molecular_weight

get_molecular_weight()

Calculates the molecular weight of the slat, based on the handles assigned.

RETURNS DESCRIPTION

Molecular weight of the slat (in Da)

get_slat_key

get_slat_key(layer, slat_id, phantom_id=None)

Convenience function to generate slat key string.

convert_slat_array_into_slat_objects

convert_slat_array_into_slat_objects(slat_array)

Converts a slat array into a dictionary of slat objects for easy access.

PARAMETER DESCRIPTION
slat_array

3D numpy array of slats - each point should either be a 0 (no slat) or a unique ID (slat here)

RETURNS DESCRIPTION

Dictionary of slats
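The grouping step behind this conversion can be sketched without the Slat class. The sketch assumes the array is indexed as slat_array[y][x][layer_index], that layers are numbered from 1, and that the result is keyed by (layer, slat ID) as in the slat_coordinate_dict description; it returns plain coordinate lists rather than slat objects. The function name is hypothetical.

```python
from collections import defaultdict


def slat_array_to_coordinates(slat_array):
    """Group occupied grid positions by (layer, slat ID); 0 means no slat."""
    coords = defaultdict(list)
    for y, row in enumerate(slat_array):
        for x, cell in enumerate(row):
            for layer_index, slat_id in enumerate(cell):
                if slat_id != 0:
                    coords[(layer_index + 1, slat_id)].append((y, x))
    return dict(coords)


# 2 x 2 grid with 2 layers: layer 1 slats run along rows, layer 2 along columns.
slat_array = [
    [[1, 1], [1, 2]],
    [[2, 1], [2, 2]],
]
slat_coordinates = slat_array_to_coordinates(slat_array)
```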

HandleLinkManager(handle_link_df=None)

Manages handle linking information for megastructures.

This class tracks three types of constraints on handle values:

  1. Linked Groups: Handles that must share the same value. When one handle in a group changes, all others must change to match.
     - Stored in: handle_link_to_group (key → group_id) and handle_group_to_link (group_id → list of keys)

  2. Enforced Values: Groups that must have a specific handle value.
     - Stored in: handle_group_to_value (group_id → value)

  3. Blocked Handles: Individual handles that must be zero (deleted).
     - Stored in: handle_blocks (list of keys)

Handle keys use the convention: (slat_name, position, helix_side)

Example: ('layer1-slat5', 3, 2) means position 3 on H2 side of layer1-slat5.

All group IDs are numeric integers, persisted to file on export.
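The bookkeeping described above can be mimicked with a toy class. It reuses the attribute names listed in this section, but the merging rule in add_link and everything else about the behaviour is an assumption; ToyLinkManager is an illustrative stand-in, not the library's HandleLinkManager.

```python
class ToyLinkManager:
    """Illustrative stand-in showing the four containers and how they relate."""

    def __init__(self):
        self.handle_link_to_group = {}   # key -> group_id
        self.handle_group_to_link = {}   # group_id -> list of keys
        self.handle_group_to_value = {}  # group_id -> enforced value
        self.handle_blocks = []          # keys that must stay zero
        self._next_group_id = 1

    def add_link(self, key_1, key_2):
        g1 = self.handle_link_to_group.get(key_1)
        g2 = self.handle_link_to_group.get(key_2)
        if g1 is None and g2 is None:
            # Neither key is linked yet: open a new group.
            group_id = self._next_group_id
            self._next_group_id += 1
            self.handle_group_to_link[group_id] = []
        else:
            group_id = g1 if g1 is not None else g2
        if g1 is not None and g2 is not None and g1 != g2:
            # Both keys already belong to different groups: merge g2 into g1.
            for key in self.handle_group_to_link.pop(g2):
                self.handle_link_to_group[key] = g1
                self.handle_group_to_link[g1].append(key)
        for key in (key_1, key_2):
            if key not in self.handle_link_to_group:
                self.handle_link_to_group[key] = group_id
                self.handle_group_to_link[group_id].append(key)

    def add_block(self, key):
        if key not in self.handle_blocks:
            self.handle_blocks.append(key)


manager = ToyLinkManager()
manager.add_link(("layer1-slat5", 3, 2), ("layer1-slat6", 3, 2))
manager.add_link(("layer1-slat6", 3, 2), ("layer1-slat7", 3, 2))
manager.add_block(("layer2-slat1", 1, 5))
```

Keeping both the key-to-group and group-to-keys maps makes it cheap to answer "which group is this handle in?" and "which handles must change together?" without scanning.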

add_block(key)

Adds a block to a handle.

PARAMETER DESCRIPTION
key

(slat_name, position, side)

remove_block(key)

Removes a block from a handle.

PARAMETER DESCRIPTION
key

(slat_name, position, side)

remove_link(key)

Removes a link from a handle.

PARAMETER DESCRIPTION
key

(slat_name, position, side)

remove_group(group_id)

Removes an entire handle link group.

PARAMETER DESCRIPTION
group_id

Group ID to remove

add_link(key_1, key_2)

Adds a link between two handles.

PARAMETER DESCRIPTION
key_1

(slat_name, position, side)

key_2

(slat_name, position, side)

Slat Design

crisscross.core_functions.slat_design

generate_standard_square_slats

generate_standard_square_slats(slat_count=32)

Generates a base array for a square megastructure design

PARAMETER DESCRIPTION
slat_count

Number of handle positions in each slat

DEFAULT: 32

RETURNS DESCRIPTION

3D numpy array with x/y slat positions and a list of the unique slat IDs for each layer

read_design_from_excel

read_design_from_excel(folder, sheets)

Reads a megastructure design pre-populated in Excel. 0s indicate no slats; all other numbers are slat IDs.

PARAMETER DESCRIPTION
folder

Folder containing excel sheets

sheets

List of sheets containing the positions of slats (in order, starting from the bottom layer)

RETURNS DESCRIPTION

3D numpy array with x/y slat positions

attach_cargo_handles_to_core_sequences

attach_cargo_handles_to_core_sequences(
    pattern, sequence_map, target_plate, slat_type="X", handle_side=2
)

Concatenates cargo handles to provided sequences according to cargo pattern.

TODO: extend for any shape (currently only 2D squares).

PARAMETER DESCRIPTION
pattern

2D array showing where each cargo handle will be attached to a slat

sequence_map

Sequence to use for each particular cargo handle in pattern

target_plate

Plate class containing all the pre-mapped sequences/wells for the selected slats

slat_type

Type of slat to attach to (X or Y)

DEFAULT: 'X'

handle_side

H2 or H5 handle position

DEFAULT: 2

RETURNS DESCRIPTION

Dataframe containing all new sequences to be ordered for cargo attachment

generate_patterned_square_cco

generate_patterned_square_cco(pattern='2_slat_jump')

Pre-generates a square megastructure with specific repeating patterns.

PARAMETER DESCRIPTION
pattern

Choose from the set of available pre-made patterns.

DEFAULT: '2_slat_jump'

RETURNS DESCRIPTION

2D array containing pattern

TODO: make size of square adjustable

Plate Handling

crisscross.core_functions.plate_handling

add_data_to_plate_df

add_data_to_plate_df(letters, column_total, data_dict)

Creates an empty plate (i.e. with rows/columns premade) and inserts the provided data dict, leaving all empty cells blank.

PARAMETER DESCRIPTION
letters

Letters to use for rows.

column_total

Total number of columns (numbers)

data_dict

Nested dictionary containing data to input into plate (keys are letters, values are dictionaries with numbers as keys)

RETURNS DESCRIPTION

Updated plate
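The nested data_dict layout (row letters as outer keys, column numbers as inner keys) can be illustrated with plain Python containers. The real function works with a dataframe; this sketch uses a dictionary of lists instead and the name build_plate_rows is hypothetical.

```python
def build_plate_rows(letters, column_total, data_dict):
    """Return {row letter: [cell values]} with "" for every empty well."""
    plate = {}
    for letter in letters:
        row_data = data_dict.get(letter, {})
        # Columns are numbered from 1, matching standard plate notation.
        plate[letter] = [row_data.get(col, "") for col in range(1, column_total + 1)]
    return plate


data_dict = {"A": {1: "seq_1", 3: "seq_2"}, "C": {4: "seq_3"}}
plate = build_plate_rows("ABC", 4, data_dict)
```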

read_dna_plate_mapping

read_dna_plate_mapping(filename, data_type='2d_excel', plate_size=384)

Reads a DNA plate mapping file and returns a dataframe with the data.

PARAMETER DESCRIPTION
filename

Filename to read (full path)

data_type

Type of data - currently only supports 2d_excel and IDT_order

DEFAULT: '2d_excel'

plate_size

Either 96 or 384-well plate sizes

DEFAULT: 384

RETURNS DESCRIPTION

Dataframe containing all data

generate_new_plate_from_slat_handle_df

generate_new_plate_from_slat_handle_df(
    data_df,
    folder,
    filename,
    restart_row_by_column=None,
    data_type="2d_excel",
    plate_size=384,
    plate_name=None,
    scramble_names=False,
    output_generic_cargo_plate_mapping=False,
)

Generates a new plate from a dataframe containing sequences, names and notes, then saves it to file. TODO: make faster and more elegant.

PARAMETER DESCRIPTION
data_df

Main sequence data to export containing sequence, name and description.

folder

Output folder.

filename

Output filename.

restart_row_by_column

Set this to a column name to restart the row number when the value changes.

DEFAULT: None

data_type

Either 2d_excel (2d output array) or IDT_order (for IDT order form).

DEFAULT: '2d_excel'

plate_size

96 or 384

DEFAULT: 384

plate_name

Name of the plate (used for IDT order form)

DEFAULT: None

scramble_names

If true, scrubs all identifiable names when preparing an IDT order output

DEFAULT: False

output_generic_cargo_plate_mapping

If true, outputs the generic cargo plate mapping naming schemes to the 'names' sheet so that the generic cargo plate system can be used directly from the generated file

DEFAULT: False

RETURNS DESCRIPTION

final dataframe that's saved to file

Graphics and Visualization

PyVista 3D

crisscross.graphics.pyvista_3d

rounded_polyline

rounded_polyline(
    points, corner_fraction=0.45, samples_per_corner=16, uturn_angle_threshold_deg=170.0
)

Replace interior corners with fillets; handles three cases:

  - nearly-straight: pass-through
  - normal corner: quadratic Bezier fillet (A..P1..B)
  - near-180° (U-turn): generate a true semicircular arc across p1

Returns np.array((M,3)).
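The "normal corner" case can be sketched in plain Python. A quadratic Bezier curve with control points A, P1, B is B(t) = (1-t)^2 A + 2(1-t)t P1 + t^2 B. How corner_fraction places A and B is an assumption here (a fraction of the way from the corner towards each neighbouring point), and the nearly-straight and U-turn cases are not handled. Both function names are hypothetical.

```python
def quadratic_bezier_fillet(a, p1, b, samples=16):
    """Sample the quadratic Bezier curve with control points a, p1, b."""
    points = []
    for i in range(samples + 1):
        t = i / samples
        w0, w1, w2 = (1 - t) ** 2, 2 * (1 - t) * t, t ** 2
        points.append(tuple(w0 * ac + w1 * pc + w2 * bc for ac, pc, bc in zip(a, p1, b)))
    return points


def fillet_corner(p0, p1, p2, corner_fraction=0.45, samples=16):
    """Round the corner at p1 between segments p0-p1 and p1-p2."""
    # A and B sit on the incoming and outgoing segments, measured from the corner.
    a = tuple(pc + corner_fraction * (qc - pc) for pc, qc in zip(p1, p0))
    b = tuple(pc + corner_fraction * (qc - pc) for pc, qc in zip(p1, p2))
    return quadratic_bezier_fillet(a, p1, b, samples)


# A right-angle corner at (1, 0, 0).
arc = fillet_corner((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0))
```

The curve starts at A, ends at B and is tangent to both segments there, so the fillet joins the straight parts without a kink.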

parallel_transport_frames

parallel_transport_frames(points)

Compute tangent, normal, binormal at each point using parallel transport. Returns (tangents, normals, binormals) arrays of the same length as points.

build_swept_tube

build_swept_tube(curve_pts, radius, n_sides=32, cap_ends=False)

Build a tube mesh by sweeping a circle along curve_pts using parallel-transport frames. Returns a pyvista.PolyData mesh.
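The core step of such a sweep is placing a ring of vertices around each curve point, in the plane spanned by that point's normal and binormal. The following is a small sketch of that step only; ring_vertices is a hypothetical helper, and building faces between neighbouring rings (or a pyvista.PolyData object) is left out.

```python
import math


def ring_vertices(center, normal, binormal, radius, n_sides=32):
    """Return n_sides points on a circle around center in the normal/binormal plane.

    normal and binormal are assumed to be unit length and perpendicular.
    """
    ring = []
    for k in range(n_sides):
        angle = 2.0 * math.pi * k / n_sides
        c, s = math.cos(angle), math.sin(angle)
        ring.append(tuple(
            center[i] + radius * (c * normal[i] + s * binormal[i])
            for i in range(3)
        ))
    return ring


ring = ring_vertices((1.0, 2.0, 3.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), radius=0.5, n_sides=8)
print(ring[0])
```

Connecting vertex k of one ring to vertex k of the next gives the quads of the tube wall; consistent frames along the curve keep those quads from twisting.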

create_graphical_3D_view

create_graphical_3D_view(
    slat_array,
    slats,
    save_folder,
    layer_palette,
    cargo_palette=None,
    connection_angle="90",
    window_size=(2048, 2048),
    filename_prepend="",
)

Creates a 3D video of a megastructure slat design.

PARAMETER DESCRIPTION
slat_array

A 3D numpy array with x/y slat positions (slat ID placed in each position occupied)

slats

Dictionary of slat objects

save_folder

Folder in which all videos are saved.

layer_palette

Dictionary of layer information (e.g. top/bottom helix and colors), where keys are layer numbers.

cargo_palette

Dictionary of cargo information (e.g. colors), where keys are cargo types.

DEFAULT: None

connection_angle

The angle of the slats in the design (either '90' or '60' for now).

DEFAULT: '90'

window_size

Resolution of video generated. 2048x2048 seems reasonable in most cases.

DEFAULT: (2048, 2048)

filename_prepend

String to prepend to the filename of the video.

DEFAULT: ''

RETURNS DESCRIPTION

N/A

Blender 3D

crisscross.graphics.blender_3d

look_at

look_at(obj, target)

Points the provided object towards the target vector

PARAMETER DESCRIPTION
obj

Blender object (typically a camera or light)

target

The target 3D vector

RETURNS DESCRIPTION

N/A

srgb_to_linear

srgb_to_linear(c)

Convert an sRGB tuple (0–1) to linear RGB (0–1) for Blender.
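For reference, the standard sRGB transfer function is piecewise: a linear segment near black and a power curve elsewhere. The sketch below applies it per channel. That the library uses exactly this formula is an assumption, and the function names here are illustrative.

```python
def srgb_channel_to_linear(c):
    """Convert one sRGB channel in [0, 1] to linear light using the standard sRGB curve."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def srgb_tuple_to_linear(color):
    """Apply the conversion to every channel of an sRGB tuple."""
    return tuple(srgb_channel_to_linear(c) for c in color)


print(srgb_tuple_to_linear((0.0, 0.5, 1.0)))
```

Mid-grey in sRGB (0.5) maps to roughly 0.214 in linear space, which is why colors passed to a renderer without conversion tend to look washed out.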

create_slat_material

create_slat_material(color, mat_name, metallic_strength=0.8, alpha_animation=False)

PARAMETER DESCRIPTION
color

RGB color code (4-value, with the last value being the alpha).

mat_name

The name to assign to the material.

metallic_strength

How metallic the final material should be (default is pretty metallic).

DEFAULT: 0.8

alpha_animation

Set to True to enable alpha animation (for slat wipe-in animations).

DEFAULT: False

RETURNS DESCRIPTION

The complete material object.

set_slat_wipe_in_animation

set_slat_wipe_in_animation(
    frame_start,
    frame_end,
    slat_id,
    slat_cylinder,
    slat_center,
    slat_rotation,
    slat_length,
    hide_cube=True,
)

Sets up the animation for a single slat, which involves creating a cuboid and slowly covering the slat with the cuboid.

PARAMETER DESCRIPTION
frame_start

The frame from which to start the animation

frame_end

The frame at which the animation ends

slat_id

The slat's name

slat_cylinder

The slat cylinder object pre-created in Blender

slat_center

The center of the slat

slat_rotation

The slat's orientation

slat_length

The slat's length

hide_cube

Set to true to hide the cover-cube from the viewport

DEFAULT: True

RETURNS DESCRIPTION

N/A

set_slat_translate_animation

set_slat_translate_animation(
    frame_start,
    frame_end,
    slat_cylinder,
    slat_center,
    slat_rotation,
    slat_length,
    extension_length=2.0,
)

Sets up a translation-based entry animation for a slat.

PARAMETER DESCRIPTION
frame_start

The frame from which to start the animation.

frame_end

The frame at which the animation ends.

slat_cylinder

The slat cylinder object pre-created in Blender.

slat_center

The center of the slat.

slat_rotation

The slat's orientation.

slat_length

The slat's length.

extension_length

The distance that the slat will move to complete the animation.

DEFAULT: 2.0

RETURNS DESCRIPTION

N/A

check_slat_animation_direction

check_slat_animation_direction(
    start_point,
    end_point,
    current_slat_id,
    current_layer,
    slats,
    animate_slat_group_dict,
)

Attempts a quick check to prevent slats from appearing 'out of thin air' but rather from a top/bottom support.

PARAMETER DESCRIPTION
start_point

The current slat start position

end_point

The current slat end position

current_slat_id

The current slat ID (name)

current_layer

The slat's layer

slats

The dict of all slats in the design

animate_slat_group_dict

The dictionary of slat animation groups, in order

RETURNS DESCRIPTION

The new start and end position for the slat animation

interpret_cargo_system

interpret_cargo_system(
    slats,
    layer_palette,
    grid_xd,
    grid_yd,
    slat_width,
    cargo_materials,
    frame_start=0,
    frame_end=0,
)

Interprets the cargo dict and places cargo stubs in the 3D scene, along with an animation if requested. Required to generate cargo cylinders.

PARAMETER DESCRIPTION
slats

The dict of all slats in the design

layer_palette

Dictionary of layer information (e.g. top/bottom helix and colors), where keys are layer numbers.

grid_xd

The grid x-jump distance

grid_yd

The grid y-jump distance

slat_width

The width of the slat, used to determine the radius of the cargo cylinders.

cargo_materials

Dictionary of cargo materials, where keys are cargo types and values are the material objects.

frame_start

Animation start frame

DEFAULT: 0

frame_end

Animation end frame. If frame_start=frame_end, no animation will be added.

DEFAULT: 0

RETURNS DESCRIPTION

N/A

interpret_seed_system

interpret_seed_system(slats, layer_palette, seed_material, grid_xd, grid_yd)

Interprets the seed array and places seed cylinders in the Blender scene. Makes the assumption that np.where can correctly figure out where each seed cylinder starts/stops. If there are errors here, this will need to be fixed.

PARAMETER DESCRIPTION
slats

The dict of all slats in the design

layer_palette

Dictionary of layer information (e.g. top/bottom helix and colors), where keys are layer numbers.

seed_material

The material to use for the seed

grid_xd

The grid x-jump distance

grid_yd

The grid y-jump distance

RETURNS DESCRIPTION

N/A

create_graphical_3D_view_bpy

create_graphical_3D_view_bpy(
    slat_array,
    slats,
    layer_palette,
    save_folder,
    cargo_palette=None,
    animate_slat_group_dict=None,
    animate_delay_frames=40,
    connection_angle="90",
    camera_spin=False,
    animation_type="translate",
    specific_slat_translate_distances=None,
    correct_slat_entrance_direction=True,
    force_slat_color_by_layer=True,
    slat_flip_list=None,
    include_bottom_light=False,
    filename_prepend="",
)

Creates a 3D video of a megastructure slat design.

PARAMETER DESCRIPTION
slat_array

A 3D numpy array with x/y slat positions (slat ID placed in each position occupied)

slats

Dictionary of slat objects

layer_palette

Dictionary of layer information (e.g. top/bottom helix and colors), where keys are layer numbers.

save_folder

Folder in which all videos are saved.

cargo_palette

Dictionary of cargo information, where keys are cargo types and values are dictionaries with color and other properties.

DEFAULT: None

animate_slat_group_dict

Dictionary of slat IDs and the group they belong to for animation

DEFAULT: None

animate_delay_frames

Number of frames to delay between each slat group animation

DEFAULT: 40

connection_angle

The angle of the slats in the design (either '90' or '60' for now).

DEFAULT: '90'

camera_spin

Set to True to have the camera spin around the design

DEFAULT: False

animation_type

The type of animation to use for slat entry ('translate' or 'wipe_in')

DEFAULT: 'translate'

specific_slat_translate_distances

The distance each slat will move when using the translate animation. Provide this as a dictionary; only slats whose distance differs from the default value of 2 need an entry.

DEFAULT: None

correct_slat_entrance_direction

If set to true, will attempt to correct the slat entrance animation to always start from a place that is supported.

DEFAULT: True

force_slat_color_by_layer

Forces a slat to be colored by layer rather than by animation group (if animation is on).

DEFAULT: True

slat_flip_list

List of slat IDs - if a slat is in this list, its animation direction will be flipped. This cannot be used in conjunction with the correct_slat_entrance_direction parameter.

DEFAULT: None

include_bottom_light

Set to True to include a light source at the bottom of the design.

DEFAULT: False

filename_prepend

String to prepend to the filename of the video.

DEFAULT: ''

RETURNS DESCRIPTION

N/A

Static Plots

crisscross.graphics.static_plots

slat_axes_setup

slat_axes_setup(slat_array, axis, grid_xd, grid_yd, reverse_y=False)

Prepares a matplotlib axis for slat plotting, making sure to extend limits to fit the input design.

PARAMETER DESCRIPTION
slat_array

3D numpy array with x/y slat positions (slat ID placed in each position occupied).

axis

Axes to adjust.

grid_xd

X scaling factor for x values.

grid_yd

Y scaling factor for y values.

reverse_y

Set to true to reverse y axis (useful for side profile views).

DEFAULT: False

RETURNS DESCRIPTION

N/A

physical_point_scale_convert

physical_point_scale_convert(point, grid_xd, grid_yd)

Converts a point from the grid scaling system into an actual 'physical' measure, w.r.t. the distance between handles.

PARAMETER DESCRIPTION
point

Input point (x,y) to convert

grid_xd

Scaling factor for x value

grid_yd

Scaling factor for y value

RETURNS DESCRIPTION

Scaled point (x,y)

points_per_data

points_per_data(ax)

Return points-per-data-unit (x, y) for an axes, given current limits and layout. Make sure the figure is drawn before calling this (fig.canvas.draw()).

create_graphical_slat_view

create_graphical_slat_view(
    slat_array,
    layer_palette,
    cargo_palette=None,
    include_seed=True,
    include_cargo=True,
    slats=None,
    save_to_folder=None,
    instant_view=True,
    filename_prepend="",
    connection_angle="90",
)

Creates a graphical view of all slats in the assembled design, including cargo and seed handles. A single figure is created for the global view of the structure, as well as individual figures for each layer in the design.

PARAMETER DESCRIPTION
slat_array

A 3D numpy array with x/y slat positions (slat ID placed in each position occupied)

layer_palette

Dictionary of layer information (e.g. top/bottom helix and colors), where keys are layer numbers.

cargo_palette

Dictionary of cargo information (e.g. colors), where keys are cargo types.

DEFAULT: None

include_seed

Set to True to include seed handles in the graphical view.

DEFAULT: True

include_cargo

Set to True to include cargo handles in the graphical view.

DEFAULT: True

slats

Dictionary of slat objects (if not provided, will be generated from slat_array)

DEFAULT: None

save_to_folder

Set to the filepath of a folder where all figures will be saved.

DEFAULT: None

instant_view

Set to True to plot the figures immediately to your active view.

DEFAULT: True

filename_prepend

String to prepend to generated files.

DEFAULT: ''

connection_angle

The angle of the slats in the design (either '90' or '60' for now).

DEFAULT: '90'

RETURNS DESCRIPTION

N/A

create_graphical_assembly_handle_view

create_graphical_assembly_handle_view(
    slat_array,
    handle_arrays,
    layer_palette,
    slats=None,
    save_to_folder=None,
    connection_angle="90",
    filename_prepend="",
    instant_view=True,
)

Creates a graphical view of all handles in the assembled design, along with a side profile.

PARAMETER DESCRIPTION
slat_array

A 3D numpy array with x/y slat positions (slat ID placed in each position occupied)

handle_arrays

A 3D numpy array with x/y handle positions (handle ID placed in each position occupied)

layer_palette

Dictionary of layer information (e.g. top/bottom helix and colors), where keys are layer numbers.

slats

Dictionary of slat objects (if not provided, will be generated from slat_array)

DEFAULT: None

save_to_folder

Set to the filepath of a folder where all figures will be saved.

DEFAULT: None

connection_angle

The angle of the slats in the design (either '90' or '60' for now).

DEFAULT: '90'

filename_prepend

String to prepend to generated files.

DEFAULT: ''

instant_view

Set to True to plot the figures immediately to your active view.

DEFAULT: True

RETURNS DESCRIPTION

N/A

Slat Handle Match Evolver

Handle Evolution

crisscross.slat_handle_match_evolver.handle_evolution

EvolveManager

EvolveManager(
    megastructure: Megastructure,
    random_seed=8,
    generational_survivors=3,
    mutation_rate=5,
    mutation_type_probabilities=(0.425, 0.425, 0.15),
    unique_handle_sequences=32,
    evolution_generations=200,
    evolution_population=30,
    split_sequence_handles=False,
    sequence_split_factor=2,
    process_count=None,
    early_max_valency_stop=None,
    log_tracking_directory=None,
    progress_bar_update_iterations=2,
    mutation_memory_system="off",
    memory_length=10,
    repeating_unit_constraints=None,
    similarity_score_calculation_frequency=10,
    update_scope="interfaces",
)

Prepares an evolution manager to optimize a handle array for the provided slat array. WARNING: When running this class from a script, make sure to do so inside an "if __name__ == '__main__':" block. Otherwise, the spawned processes will cause a recursion error.

PARAMETER DESCRIPTION
megastructure

Megastructure object containing slat array and optionally a seed handle array for starting evolution

TYPE: Megastructure

random_seed

Random seed to use to ensure consistency

DEFAULT: 8

generational_survivors

Number of surviving candidate arrays that persist through each generation

DEFAULT: 3

mutation_rate

The expected number of mutations per iteration

DEFAULT: 5

mutation_type_probabilities

Probability of selecting a specific mutation type for a target handle/antihandle (either handle, antihandle or mixed mutations)

DEFAULT: (0.425, 0.425, 0.15)

unique_handle_sequences

Handle library length

DEFAULT: 32

evolution_generations

Number of generations to consider before stopping

DEFAULT: 200

evolution_population

Number of handle arrays to mutate in each generation

DEFAULT: 30

split_sequence_handles

Set to true to enforce the splitting of handle sequences between subsequent layers

DEFAULT: False

sequence_split_factor

Factor by which to split the handle sequences between layers (default is 2, which means that if handles are split, the first layer will have 1/2 of the handles, etc.)

DEFAULT: 2

process_count

Number of processes to use for hamming multiprocessing (if left as None, 67% of available cores will be used)

DEFAULT: None

early_max_valency_stop

If this worst match score is achieved, the evolution will stop early

DEFAULT: None

log_tracking_directory

Set to a directory to export plots and metrics during the optimization process (optional)

DEFAULT: None

progress_bar_update_iterations

Number of iterations before progress bar is updated - useful for server output files, but does not seem to work consistently on every system (optional)

DEFAULT: 2

mutation_memory_system

The type of memory system to use for the handle mutation process. Options are 'all', 'best', 'special', or 'off'.

DEFAULT: 'off'

memory_length

Memory of previous 'worst' handle combinations to retain when selecting positions to mutate.

DEFAULT: 10

repeating_unit_constraints

Dictionary to define handle link constraints on handle mutations (mainly for use with repeating unit designs). This is largely deprecated now in favour of the Megastructure recursive handle linking system.

DEFAULT: None

similarity_score_calculation_frequency

The duplication risk score will be calculated every x generations. This helps speed up the evolution process, since it isn't used for deciding on the best generations to retain.

DEFAULT: 10

update_scope

Either 'interfaces' (default) to only place handles at layer interfaces, or 'all' to also place handles at positions with existing handles in the seed array.

DEFAULT: 'interfaces'

initialize_evolution

initialize_evolution()

Initializes the pool of candidate handle arrays.

single_evolution_step

single_evolution_step()

Performs a single evolution step, evaluating all candidate arrays and preparing new mutations for the next generation.

RETURNS DESCRIPTION

run_full_experiment

run_full_experiment(
    logging_interval=10, suppress_handle_array_export=False, save_first=False
)

Runs a full evolution experiment.

PARAMETER DESCRIPTION
logging_interval

The frequency at which logs should be written to file (including the best hamming array file).

DEFAULT: 10

suppress_handle_array_export

If true, the best handle array will not be exported at each logging interval, saving space.

DEFAULT: False

save_first

If true, saves the best handle array from the initial random population as 'best_handle_array_initial.xlsx'.

DEFAULT: False
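The general shape of a generational loop with surviving candidates can be sketched as below. This is a toy illustration, not the EvolveManager algorithm: the evolve function, its scoring callback and the flat list representation of a candidate are all assumptions, and the real class additionally handles multiprocessing, memory systems and logging.

```python
import random


def evolve(score, population, generations=50, survivors=3, mutation_rate=2,
           library_size=32, seed=8):
    """Toy elitist loop (lower score is better). Returns (best_candidate, best_score)."""
    rng = random.Random(seed)
    pool = [list(c) for c in population]  # copy so the caller's lists are untouched
    for _generation in range(generations):
        pool.sort(key=score)
        elite = pool[:survivors]  # survivors pass to the next generation unchanged
        children = []
        while len(elite) + len(children) < len(pool):
            child = list(rng.choice(elite))
            for _mutation in range(mutation_rate):
                child[rng.randrange(len(child))] = rng.randint(1, library_size)
            children.append(child)
        pool = elite + children
    best = min(pool, key=score)
    return best, score(best)


# Toy objective: number of repeated handle IDs within a candidate.
def duplicates(candidate):
    return len(candidate) - len(set(candidate))


init_rng = random.Random(0)
start_pool = [[init_rng.randint(1, 32) for _ in range(16)] for _ in range(30)]
best, best_score = evolve(duplicates, start_pool, generations=20)
print(best_score)
```

Because the survivors are carried over unmodified, the best score in the pool can never get worse from one generation to the next.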

ensure_pool

ensure_pool()

Creates a new Pool if none exists or if the existing one is not in the RUN state. Safe to call at the beginning of each generation or on resume.

Tubular Slat Match Compute

crisscross.slat_handle_match_evolver.tubular_slat_match_compute

extract_handle_dicts

extract_handle_dicts(
    handle_array, slat_array, list_indices_only=False, custom_index_map=None
)

Extracts all slats from a design and organizes them into dictionaries of handles and antihandles. If list_indices_only is set to True, the function will return lists of indices instead of the actual handle values.

PARAMETER DESCRIPTION
handle_array

Array of XxYxZ-1 dimensions containing the IDs of all the handles in the design

slat_array

Array of XxYxZ dimensions containing the positions of all slats in the design

list_indices_only

Set to True to return lists of indices instead of the actual handle values

DEFAULT: False

custom_index_map

A dictionary mapping (layer, slat_id) tuples to specific indices in the handle_array.

DEFAULT: None

RETURNS DESCRIPTION

Two dictionaries or lists containing the handle sequences (or indices) for each slat - one for handles and one for antihandles

oneshot_hamming_compute

oneshot_hamming_compute(handle_dict, antihandle_dict, slat_length)

Given a dictionary of slat handles and antihandles, this function computes the hamming distance between all possible combinations. This is the fastest implementation available, making full use of Numpy's efficient vector computation.

PARAMETER DESCRIPTION
handle_dict

Dictionary of handles i.e. {slat_id: slat_handle_array}

antihandle_dict

Dictionary of antihandles i.e. {slat_id: slat_antihandle_array}

slat_length

The length of a single slat (must be an integer)

RETURNS DESCRIPTION

Array of noflank_results for each possible combination (a single integer per combination)
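As a rough mental model of what a handle/antihandle comparison involves, the sketch below slides one slat's antihandles past another slat's handles, counts positions where the same non-zero ID lines up, and reports the smallest resulting Hamming distance. It is a plain-Python illustration only: the function name, the use of 0 for "no handle", and the exact set of offsets and orientations considered are assumptions, and the library's vectorized implementation may define these details differently.

```python
def min_hamming(handles, antihandles, slat_length=32):
    """Smallest Hamming distance over all 1D offsets and both antihandle orientations.

    handles / antihandles: lists of integer IDs, with 0 meaning "no handle here".
    The distance for one alignment is slat_length minus the number of matching IDs.
    """
    best = slat_length
    for other in (antihandles, antihandles[::-1]):
        for shift in range(-(slat_length - 1), slat_length):
            matches = 0
            for i, h in enumerate(handles):
                j = i + shift
                if 0 <= j < len(other) and h != 0 and h == other[j]:
                    matches += 1
            best = min(best, slat_length - matches)
    return best


print(min_hamming([1, 2, 0, 0], [0, 0, 1, 2], slat_length=4))
```

A low distance means two slats could bind strongly in an unintended alignment, so an optimizer would try to keep the minimum over all pairs as high as possible.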

multirule_oneshot_hamming

multirule_oneshot_hamming(
    slat_array,
    handle_array,
    report_worst_slat_combinations=True,
    per_layer_check=False,
    specific_slat_groups=None,
    request_substitute_risk_score=False,
    slat_length=32,
    partial_area_score=False,
    return_match_histogram=False,
)

Given a slat and handle array, this function computes the hamming distance of all handle/antihandle combinations provided. Scores for individual components, such as specific slat groups, can also be requested. Note: This function is our current fastest implementation. Due to its optimization, it cannot be instructed to compute only a subset of the slat combinations in the design; it always computes all hamming noflank_results at once.

Implementation comment: Requesting the similarity score requires computing more hamming combinations, which slows the function down by a factor of approximately 3. We tried to reduce this speed loss by combining the handles and antihandles into a duplicate array, which computes the following combinations:

- handle vs handle
- antihandle vs antihandle
- handle vs antihandle (duplicated twice)

However, the end result was a minor slowdown rather than a speed increase. We suspect that the additional computations forced by the duplicated handle vs antihandle combination outweigh the speed gained by computing the arrays all in one go. It might be possible to improve this further, but we do not think the solution is obvious (or potentially worth the time investment).

PARAMETER DESCRIPTION
slat_array

Array of XxYxZ dimensions, where X and Y are the dimensions of the design and Z is the number of layers in the design

handle_array

Array of XxYxZ-1 dimensions containing the IDs of all the handles in the design

report_worst_slat_combinations

Set to true to provide the IDs of the worst handle/antihandle slat combinations

DEFAULT: True

per_layer_check

Set to true to provide a hamming score for the individual layers of the design (i.e. the interface between each layer)

DEFAULT: False

specific_slat_groups

Provide a dictionary, where the key is a group name and the value is a list of tuples containing the layer and slat ID of the slats in the group for which the specific hamming distance is being requested.

DEFAULT: None

slat_length

The length of a single slat (must be an integer)

DEFAULT: 32

request_substitute_risk_score

Set to true to provide a measure of the largest amount of handle duplication between slats of the same type (handle or antihandle)

DEFAULT: False

partial_area_score

Calculates Hamming distance and substitution risk among a subset of provided slats when only considering a subset of the handles. Provide a dictionary with key as a group name and the values as dictionaries with keys "handle" and "antihandle". The corresponding values are dictionaries, where the key is a tuple like so (slat layer, slat ID) and the value is a list of TRUE/FALSE depending on whether that position's handle is included.

DEFAULT: False

return_match_histogram

If True, returns a histogram of the number of matches of each type (0 matches, 1 match, etc.).

DEFAULT: False

RETURNS DESCRIPTION

Dictionary of scores (or slat layer/handle IDs for the worst slat combinations) for each of the slat combinations requested from the design
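The nested dictionaries expected by specific_slat_groups and partial_area_score are easier to read when written out. The sketch below builds both following the parameter descriptions above; the group names, layers and slat IDs are made-up placeholders.

```python
slat_length = 32

# Group name -> list of (layer, slat ID) tuples. Names and IDs are placeholders.
specific_slat_groups = {
    "left_arm": [(1, 1), (1, 2), (2, 5)],
    "right_arm": [(1, 7), (2, 9)],
}

# Group name -> {"handle": {...}, "antihandle": {...}}. Each inner dictionary maps
# (slat layer, slat ID) to one True/False flag per handle position on that slat.
first_half = [i < slat_length // 2 for i in range(slat_length)]
partial_area_score = {
    "left_arm_overlap": {
        "handle": {(1, 1): first_half, (1, 2): first_half},
        "antihandle": {(2, 5): [True] * slat_length},
    },
}

for group, sides in partial_area_score.items():
    for side, slat_flags in sides.items():
        for slat_key, flags in slat_flags.items():
            print(group, side, slat_key, sum(flags), "positions included")
```

Here only the first 16 positions of the two layer-1 slats are scored against every position of the layer-2 slat.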

precise_hamming_compute

precise_hamming_compute(
    handle_dict, antihandle_dict, valid_product_indices, slat_length
)

Given a dictionary of slat handles and antihandles, this function computes the hamming distance between all possible combinations. It can be restricted to certain combinations by providing a list of valid product indices. This is not the fastest hamming implementation available, but does use numpy vectorization to reduce runtime.

PARAMETER DESCRIPTION
handle_dict

Dictionary of handles i.e. {slat_id: slat_handle_array}

antihandle_dict

Dictionary of antihandles i.e. {slat_id: slat_antihandle_array}

valid_product_indices

A list of indices matching the possible products that should be computed (i.e. if a product is not being requested, the index should be False)

slat_length

The length of a single slat (must be an integer)

RETURNS DESCRIPTION

Array of noflank_results for each possible combination (a single integer per combination)

multirule_precise_hamming

multirule_precise_hamming(
    slat_array,
    handle_array,
    universal_check=True,
    per_layer_check=False,
    specific_slat_groups=None,
    slat_length=32,
    request_substitute_risk_score=False,
    partial_area_score=False,
)

Given a slat and handle array, this function computes the hamming distance of all handle/antihandle combinations provided. Scores for individual components, such as specific slat groups, can also be requested. This does not use the fastest hamming calculation implementation, but can be configured to restrict the number of combinations computed.

PARAMETER DESCRIPTION
slat_array

Array of XxYxZ dimensions, where X and Y are the dimensions of the design and Z is the number of layers in the design

handle_array

Array of XxYxZ-1 dimensions containing the IDs of all the handles in the design

universal_check

Set to true to provide a hamming score for the entire set of slats in the design

DEFAULT: True

per_layer_check

Set to true to provide a hamming score for the individual layers of the design (i.e. the interface between each layer)

DEFAULT: False

specific_slat_groups

Provide a dictionary, where the key is a group name and the value is a list of tuples containing the layer and slat ID of the slats in the group for which the specific hamming distance is being requested.

DEFAULT: None

slat_length

The length of a single slat (must be an integer)

DEFAULT: 32

request_substitute_risk_score

Set to true to provide a measure of the largest amount of handle duplication between slats of the same type (handle or antihandle)

DEFAULT: False

partial_area_score

Calculates Hamming distance and substitution risk among a subset of provided slats when only considering a subset of the handles. Provide a dictionary with key as a group name and the values as dictionaries with keys "handle" and "antihandle". The corresponding values are dictionaries, where the key is a tuple like so (slat layer, slat ID) and the value is a list of TRUE/FALSE depending on whether that position's handle is included.

DEFAULT: False

RETURNS DESCRIPTION

Dictionary of scores for each of the aspects requested from the design

Handle Mutation

crisscross.slat_handle_match_evolver.handle_mutation

mutate_handle_arrays

mutate_handle_arrays(
    slat_array,
    candidate_handle_arrays,
    hallofshame,
    best_score_indices,
    unique_sequences=32,
    memory_hallofshame=None,
    memory_best_parent_hallofshame=None,
    special_hallofshame=None,
    mutation_rate=2.0,
    mutation_type_probabilities=(0.425, 0.425, 0.15),
    use_memory_type=None,
    split_sequence_handles=False,
    sequence_split_factor=2,
    repeating_unit_constraints=None,
    mutation_mask=None,
)

Mutates (randomizes handles) a set of candidate arrays into a new generation, while retaining the best scoring arrays from the previous generation.

PARAMETER DESCRIPTION
slat_array

Base slat array for design

candidate_handle_arrays

Set of candidate handle arrays from previous generation

hallofshame

Worst handle/antihandle combinations from previous generation

best_score_indices

The indices of the best scoring arrays from the previous generation

unique_sequences

Total length of handle library available

DEFAULT: 32

memory_hallofshame

List of all the worst handle/antihandle combinations from previous generations

DEFAULT: None

memory_best_parent_hallofshame

List of the worst handle/antihandle combinations from previous generations (only the ones linked to the best parents)

DEFAULT: None

special_hallofshame

List of special handle/antihandle combinations from previous generations

DEFAULT: None

mutation_rate

The expected number of mutations per cycle

DEFAULT: 2.0

mutation_type_probabilities

Probability of selecting a specific mutation type for a target handle/antihandle (either handle, antihandle or mixed mutations)

DEFAULT: (0.425, 0.425, 0.15)

use_memory_type

Select memory type to use for mutation selection ('off', 'all', 'best_memory', 'special')

DEFAULT: None

split_sequence_handles

Set to true if the handle library needs to be split between subsequent layers

DEFAULT: False

sequence_split_factor

The number of layers to split the handle library between (default is 2, which means a single layer would have half the available library)

DEFAULT: 2

repeating_unit_constraints

Dictionary of handles to link together (mostly deprecated in favour of recursive algorithm in Megastructure). Syntax is {(layer, 'top'/'bottom', (x,y)): (layer, 'top'/'bottom', (x,y))}

DEFAULT: None

mutation_mask

Optional integer mask to inform mutation system of any handle links in the slat design (created through recursion system)

DEFAULT: None

RETURNS DESCRIPTION

New generation of handle arrays to be screened
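A single mutation pass can be pictured as follows. This is a deliberately simplified sketch: the mutate helper is hypothetical, it works on a flat list rather than a 3D handle array, and treating mutation_mask as a list of 0/1 flags (0 meaning "leave this position alone") is an assumption about how a mask could be applied, not a description of the library's behaviour.

```python
import random


def mutate(handle_ids, positions, unique_sequences=32, mutation_mask=None, seed=8):
    """Return a copy of handle_ids with new random IDs at the allowed positions.

    positions: indices proposed for mutation (e.g. drawn from the worst matches).
    mutation_mask: optional list of 0/1 flags; positions flagged 0 are skipped.
    """
    rng = random.Random(seed)
    mutated = list(handle_ids)
    for pos in positions:
        if mutation_mask is not None and not mutation_mask[pos]:
            continue
        # Draw from the library while excluding the current ID, so a mutation always changes it.
        choices = [v for v in range(1, unique_sequences + 1) if v != mutated[pos]]
        mutated[pos] = rng.choice(choices)
    return mutated


original = [1, 2, 3, 4]
print(mutate(original, positions=[0, 1, 2], unique_sequences=8, mutation_mask=[1, 0, 1, 1]))
```

Targeting positions taken from the worst-scoring handle/antihandle pairs, rather than mutating uniformly at random, is what the hallofshame arguments make possible.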

Random Hamming Optimizer

crisscross.slat_handle_match_evolver.random_hamming_optimizer

generate_handle_set_and_optimize

generate_handle_set_and_optimize(
    base_array,
    unique_sequences=32,
    slat_length=32,
    max_rounds=30,
    split_sequence_handles=False,
    universal_hamming=True,
    layer_hamming=False,
    group_hamming=None,
    metric_to_optimize="Universal",
)

Generates random handle sets and attempts to choose the best set based on the hamming distance between slat assembly handles.

PARAMETER DESCRIPTION
base_array

Slat position array (3D)

unique_sequences

Max unique sequences in the handle array

DEFAULT: 32

slat_length

Length of a single slat

DEFAULT: 32

max_rounds

Maximum number of rounds to run the check

DEFAULT: 30

split_sequence_handles

Set to true to split the handle sequences between layers evenly

DEFAULT: False

universal_hamming

Set to true to compute the hamming distance for the entire set of slats

DEFAULT: True

layer_hamming

Set to true to compute the hamming distance for the interface between each layer

DEFAULT: False

group_hamming

Provide a dictionary, where the key is a group name and the value is a list of tuples containing the layer and slat ID of the slats in the group for which the specific hamming distance is being requested.

DEFAULT: None

metric_to_optimize

The metric to optimize for (Universal, Layer X or Group ID)

DEFAULT: 'Universal'

RETURNS DESCRIPTION

2D array with handle IDs
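The underlying strategy is plain random search: draw a fresh handle set each round and keep the best one seen so far. A minimal sketch, assuming a flat list of handle IDs and a user-supplied score where higher is better (both assumptions; the function name is illustrative):

```python
import random


def random_search(score, length, unique_sequences=32, max_rounds=30, seed=8):
    """Draw max_rounds random handle sets and return (best_set, best_score)."""
    rng = random.Random(seed)
    best, best_score = None, None
    for _round in range(max_rounds):
        candidate = [rng.randint(1, unique_sequences) for _ in range(length)]
        current = score(candidate)
        if best_score is None or current > best_score:
            best, best_score = candidate, current
    return best, best_score


# Toy objective: number of distinct IDs used (a stand-in for a Hamming-based metric).
best, best_score = random_search(lambda c: len(set(c)), length=16)
print(best_score)
```

Unlike the evolutionary approach, each round here is independent of the previous ones, so improvements come only from luck and the number of rounds.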

Handle Evolution with Optuna

crisscross.slat_handle_match_evolver.handle_evolve_with_optuna

OptunaEvolveManager

OptunaEvolveManager(optuna_trial, **kwargs)

Bases: EvolveManager

An EvolveManager designed to be used with Optuna for hyperparameter optimization. Inherits all other methods from the typical evolution manager.

run_full_experiment

run_full_experiment(logging_interval=10)

Runs a full evolution experiment.

PARAMETER DESCRIPTION
logging_interval

The frequency at which logs should be written to file (including the best hamming array file).

DEFAULT: 10

initialize_evolution

initialize_evolution()

Initializes the pool of candidate handle arrays.

single_evolution_step

single_evolution_step()

Performs a single evolution step, evaluating all candidate arrays and preparing new mutations for the next generation.

RETURNS DESCRIPTION

ensure_pool

ensure_pool()

Creates a new Pool if none exists or if the existing one is not in the RUN state. Safe to call at the beginning of each generation or on resume.

Plate Mapping

Hash CAD Plates

crisscross.plate_mapping.hash_cad_plates

HashCadPlate

HashCadPlate(
    plate_name,
    plate_folder,
    pre_read_plate_dfs=None,
    plate_size=384,
    apply_working_stock_concentration=None,
)

A standardized plate system that defines the same ID pattern for all possible handle types (cargo, assembly, seed, etc.). This same system can be read into #-CAD directly. Each plate also features a 2D layout tab to help visualize in the lab.

identify_wells_and_sequences

identify_wells_and_sequences()

Identifies wells and sequences from the plate data and combines them into one database.

get_sequence

get_sequence(category, slat_position, slat_side, cargo_id=0)

To retrieve data from the database, the following keys are required:

- category: 'cargo', 'assembly', 'seed', etc.
- slat_position: physical slat handle ID, i.e. 1, 2, ..., 32
- slat_side: 2 or 5 (for H2/H5 handles)
- cargo_id: 'A', 'B', ..., 'Z' (for cargo plates) or 1-64 (for assembly handles)

get_well

get_well(category, slat_position, slat_side, cargo_id=0)

To retrieve data from the database, the following keys are required:

- category: 'cargo', 'assembly', 'seed', etc.
- slat_position: physical slat handle ID, i.e. 1, 2, ..., 32
- slat_side: 2 or 5 (for H2/H5 handles)
- cargo_id: 'A', 'B', ..., 'Z' (for cargo plates) or 1-64 (for assembly handles)

get_concentration

get_concentration(category, slat_position, slat_side, cargo_id=0)

To retrieve data from the database, the following keys are required:

- category: 'cargo', 'assembly', 'seed', etc.
- slat_position: physical slat handle ID, i.e. 1, 2, ..., 32
- slat_side: 2 or 5 (for H2/H5 handles)
- cargo_id: 'A', 'B', ..., 'Z' (for cargo plates) or 1-64 (for assembly handles)

get_plate_name

get_plate_name(category, slat_position, slat_side, cargo_id)

To retrieve data from the database, the following keys are required:

- category: 'cargo', 'assembly', 'seed', etc.
- slat_position: physical slat handle ID, i.e. 1, 2, ..., 32
- slat_side: 2 or 5 (for H2/H5 handles)
- cargo_id: 'A', 'B', ..., 'Z' (for cargo plates) or 1-64 (for assembly handles)
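The four lookup keys used by the get_* methods behave like a composite dictionary key. The toy class below illustrates that idea only; ToyPlate, HandleRecord and all stored values are hypothetical placeholders and do not reflect how HashCadPlate stores its data internally.

```python
from collections import namedtuple

# Hypothetical record type; the field names are illustrative only.
HandleRecord = namedtuple("HandleRecord", ["well", "sequence", "concentration", "plate_name"])


class ToyPlate:
    """Toy stand-in for a plate database keyed by the four lookup values."""

    def __init__(self):
        self._records = {}

    def add(self, category, slat_position, slat_side, cargo_id, record):
        self._records[(category, slat_position, slat_side, cargo_id)] = record

    def _lookup(self, category, slat_position, slat_side, cargo_id=0):
        return self._records[(category, slat_position, slat_side, cargo_id)]

    def get_sequence(self, category, slat_position, slat_side, cargo_id=0):
        return self._lookup(category, slat_position, slat_side, cargo_id).sequence

    def get_well(self, category, slat_position, slat_side, cargo_id=0):
        return self._lookup(category, slat_position, slat_side, cargo_id).well


plate = ToyPlate()
# Placeholder values, not real plate contents.
plate.add("assembly", 1, 2, 5, HandleRecord("A1", "PLACEHOLDER_SEQ", 500, "toy_plate"))
print(plate.get_well("assembly", 1, 2, 5))
```

Requesting a combination that was never added raises a KeyError in this sketch, which is one simple way to surface a handle that is missing from a plate.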

Plate Constants

crisscross.plate_mapping.plate_constants

sanitize_plate_map

sanitize_plate_map(name)

Actual plate name for the Echo always just features the person's name and the plate ID.

PARAMETER DESCRIPTION
name

Long-form plate name

RETURNS DESCRIPTION

Barebones plate name for Echo

Non-Standard Plates

Cargo Plates

crisscross.plate_mapping.non_standard_plates.cargo_plates

GenericPlate

GenericPlate(*args, **kwargs)

Bases: BasePlate

A generic cargo plate system that can read in any plate file with the handle-position-cargo syntax defined in the top left cell. ID numbers are assigned to cargo at run-time.

Core Plates

crisscross.plate_mapping.non_standard_plates.core_plates

ControlPlate

ControlPlate(*args, **kwargs)

Bases: BasePlate

Core control plate containing slat sequences and flat (not jutting out) H2/H5 staples.

ControlPlateWithDuplicates

ControlPlateWithDuplicates(*args, **kwargs)

Bases: BasePlate

Core control plate containing slat sequences and flat (not jutting out) H2/H5 staples. This format allows for duplicated source wells, which should help combat Echo errors.

Crisscross Plates

crisscross.plate_mapping.non_standard_plates.crisscross_plates

CrisscrossHandlePlates

CrisscrossHandlePlates(*args, plate_slat_sides, **kwargs)

Bases: BasePlate

Mix of multiple plates containing all possible 32x32x2 combinations of crisscross handles. In this system, cargo ID = assembly handle ID.

Seed Plates

crisscross.plate_mapping.non_standard_plates.seed_plates

Hybrid Plates

crisscross.plate_mapping.non_standard_plates.hybrid_plates

HybridPlate

HybridPlate(*args, **kwargs)

Bases: GenericPlate

A generic cargo plate system that can read in any plate file with the handle-position-cargo syntax defined in the top left cell. ID numbers are assigned to cargo at run-time.

Plate Concentrations

crisscross.plate_mapping.non_standard_plates.plate_concentrations

apply_well_exceptions

apply_well_exceptions(complete_echo_df)

Some wells in certain plates have concentrations that differ from the rest in the plate, either due to a lab mistake or some other design consideration. This function applies fixes to the specific wells we have identified, if found in the design.

PARAMETER DESCRIPTION
complete_echo_df

The full echo dataframe

RETURNS DESCRIPTION

Same echo dataframe with patches applied

Hash CAD Plate Converter

crisscross.plate_mapping.non_standard_plates.hash_cad_plate_converter

sanitize_plate_map

sanitize_plate_map(name)

Actual plate name for the Echo always just features the person's name and the plate ID.

PARAMETER DESCRIPTION
name

Long-form plate name

RETURNS DESCRIPTION

Barebones plate name for Echo

apply_name_update

apply_name_update(category, side, position, cargo_id, sp_name, plate, index)

Helper function to apply the new name and description to the plate DataFrame.

Helper Functions

Lab Helper Sheet Generation

crisscross.helper_functions.lab_helper_sheet_generation

next_excel_column_name

next_excel_column_name(n)

Given a 0-based index, return the Excel-style column name.
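
As an illustration of the 0-based mapping, a minimal sketch is shown here. The function name `excel_column_name` is hypothetical, and this is not necessarily how next_excel_column_name is implemented.

```python
def excel_column_name(n: int) -> str:
    """Convert a 0-based column index to an Excel-style column name."""
    if n < 0:
        raise ValueError("index must be non-negative")
    name = ""
    n += 1  # work in 1-based terms so the bijective base-26 arithmetic is simple
    while n > 0:
        n, remainder = divmod(n - 1, 26)
        name = chr(ord("A") + remainder) + name
    return name


print(excel_column_name(0))   # A
print(excel_column_name(26))  # AA
```

Excel column names have no zero digit, which is why the loop subtracts 1 before each division.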

apply_box_border

apply_box_border(ws, top_left, top_right, bottom_left, bottom_right, style='thick')

Applies a border (thick by default) to an excel sheet, surrounding the specified cells.

PARAMETER DESCRIPTION
ws

Excel worksheet object.

top_left

Top left cell.

top_right

Top right cell.

bottom_left

Bottom left cell.

bottom_right

Bottom right cell.

style

Border style to be applied.

DEFAULT: 'thick'

RETURNS DESCRIPTION

N/A, applied in-place.

adjust_column_width

adjust_column_width(ws)

Adjusts the column width of an excel sheet based on the maximum length of the content in each column.

PARAMETER DESCRIPTION
ws

Excel sheet object.

RETURNS DESCRIPTION

N/A, adjusted in-place.

prepare_master_mix_sheet

prepare_master_mix_sheet(
    slat_dict,
    echo_sheet=None,
    reference_handle_volume=150,
    reference_handle_concentration=500,
    slat_mixture_volume=50,
    unique_transfer_volume_plates=None,
    workbook=None,
    handle_mix_ratio=10,
    split_core_staple_pools=False,
)

Prepares a 'master mix' sheet to be used for combining slat mixtures with scaffold and core staples into the final slat mixture.

PARAMETER DESCRIPTION
slat_dict

Dictionary of slats with slat names as keys and slat objects as values.

echo_sheet

Exact list of commands sent to the Echo robot for this group of slats.

DEFAULT: None

reference_handle_volume

Reference staple volume for each handle in a pool in nL (this refers to the control handles plate).

DEFAULT: 150

reference_handle_concentration

Reference staple concentration used for the core staples in uM (this refers to the control handles plate). All concentration values will be referenced to this value.

DEFAULT: 500

slat_mixture_volume

Reaction volume (in uL) for a single slat annealing mixture. Can be set to 'max' to use up all available handle mix.

DEFAULT: 50

handle_mix_ratio

Ratio of handle mix concentration to scaffold concentration (default is 10).

DEFAULT: 10

unique_transfer_volume_plates

Plates that have special non-standard volumes. This will be ignored if the echo sheet is provided with the exact details.

DEFAULT: None

workbook

The workbook to which the new excel sheet should be added.

DEFAULT: None

split_core_staple_pools

If True, the core staples will be assumed to have been split into 4 pools (S0, S1, S3, S4).

DEFAULT: False

RETURNS DESCRIPTION

Workbook with new sheet included.

prepare_peg_purification_sheet

prepare_peg_purification_sheet(
    slat_dict,
    groups_per_layer=2,
    max_slat_concentration_uM=2,
    slat_mixture_volume=50,
    workbook=None,
    echo_sheet=None,
    special_slat_groups=None,
    peg_concentration=2,
)

Prepares standard instructions for combining and purifying slat mixtures using PEG purification. Also prepares lists of slat groups as a reference for when in the lab.

PARAMETER DESCRIPTION
slat_dict

Dictionary of slats with slat names as keys and slat objects as values.

groups_per_layer

Number of PEG groups to use per crisscross layer. You might want to adjust this if you have too many slats together in one group.

DEFAULT: 2

max_slat_concentration_uM

Maximum concentration of slats in a combined PEG mixture (in uM) before a warning is triggered.

DEFAULT: 2

slat_mixture_volume

Reaction volume (in uL) for a single slat annealing mixture.

DEFAULT: 50

workbook

The workbook to which the new excel sheet should be added.

DEFAULT: None

echo_sheet

Exact list of commands sent to the Echo robot for this group of slats.

DEFAULT: None

special_slat_groups

IDs of slats that should be separated from the general slat groups and placed in their own group.

DEFAULT: None

peg_concentration

PEG concentration (in terms of X) to be used as the stock solution for the purification step.

DEFAULT: 2

RETURNS DESCRIPTION

Workbook with new sheet included.

prepare_all_standard_sheets

prepare_all_standard_sheets(
    slat_dict,
    save_filepath,
    reference_single_handle_volume=150,
    reference_single_handle_concentration=500,
    slat_mixture_volume=50,
    peg_groups_per_layer=2,
    peg_concentration=2,
    echo_sheet=None,
    max_slat_concentration_uM=2,
    unique_transfer_volume_plates=None,
    special_slat_groups=None,
    handle_mix_ratio=10,
    split_core_staple_pools=False,
)

Prepares a series of excel sheets to aid the lab assembler while preparing and purifying slat mixtures.

PARAMETER DESCRIPTION
slat_dict

Dictionary of slats to be assembled (each item in the dict is a Slat Object containing all 64 handles in place)

save_filepath

Output file path for the combined excel workbook

reference_single_handle_volume

Reference staple volume for each handle in a pool in nL (this refers to the control handles plate).

DEFAULT: 150

reference_single_handle_concentration

Reference staple concentration used for the core staples in uM (this refers to the control handles plate). All concentration values will be referenced to this value.

DEFAULT: 500

slat_mixture_volume

Reaction volume (in uL) for a single slat annealing mixture. Can be set to 'max' to use up all available handle mix.

DEFAULT: 50

peg_groups_per_layer

Number of PEG groups to use per crisscross layer. You might want to adjust this if you have too many slats together in one group.

DEFAULT: 2

peg_concentration

PEG concentration (in terms of X) to be used as the stock solution for the purification step.

DEFAULT: 2

echo_sheet

Exact echo commands to use as a reference for calculating slat concentrations.

DEFAULT: None

max_slat_concentration_uM

Maximum concentration of slats in a combined PEG mixture (in uM) before a warning is triggered.

DEFAULT: 2

unique_transfer_volume_plates

Plates that have special non-standard volumes. This will be ignored if the echo sheet is provided with the exact details.

DEFAULT: None

special_slat_groups

IDs of slats that should be separated from the general slat groups and placed in their own group.

DEFAULT: None

handle_mix_ratio

Ratio of handle mix concentration to scaffold concentration (default is 10).

DEFAULT: 10

split_core_staple_pools

If True, the core staples will be assumed to have been split into 4 pools (S0, S1, S3, S4).

DEFAULT: False

RETURNS DESCRIPTION

N/A, file saved directly to disk.

prepare_liquid_handle_plates_multiple_files

prepare_liquid_handle_plates_multiple_files(
    output_directory,
    file_list=None,
    extract_all_from_folder=None,
    target_concentration_uM=1000,
    volume_cap_ul=120,
    target_concentration_per_plate=None,
    max_commands_per_file=None,
    plot_distribution_per_plate=True,
    plate_size="384",
)

Generates resuspension maps for all provided DNA spec files.

PARAMETER DESCRIPTION
output_directory

Output folder to save results.

file_list

Specific list of filepaths to assess.

DEFAULT: None

extract_all_from_folder

Alternatively, specify a folder and all excel sheets will be extracted from the folder.

DEFAULT: None

target_concentration_uM

Target concentration for plate resuspension.

DEFAULT: 1000

volume_cap_ul

Maximum volume to resuspend (after which the volume is kept constant and the concentration is raised instead).

DEFAULT: 120

target_concentration_per_plate

Set to a dictionary of concentrations per plate if different concentrations are desired for different plates.

DEFAULT: None

max_commands_per_file

Maximum commands that can be taken in by the liquid handler in one go. If this is exceeded, files are split into different components.

DEFAULT: None

plot_distribution_per_plate

If true, generate a volume distribution plot for each plate.

DEFAULT: True

plate_size

Specify the size of the plate to generate.

DEFAULT: '384'

RETURNS DESCRIPTION

N/A

prepare_liquid_handler_plate_resuspension_map

prepare_liquid_handler_plate_resuspension_map(
    filename,
    output_directory,
    target_concentration_uM=1000,
    volume_cap_ul=120,
    max_commands_per_file=None,
    plot_distribution_per_plate=True,
    target_concentration_per_plate=None,
    plate_size="384",
)

Generates a visual plate map and resuspension instructions for an entire plate of DNA oligos. The amount of DNA per well should be specified in an excel file using the standard IDT format.

PARAMETER DESCRIPTION
filename

Excel file containing DNA spec sheets (can contain multiple plates per sheet).

output_directory

Output folder to save results.

target_concentration_uM

Target concentration for plate resuspension.

DEFAULT: 1000

volume_cap_ul

Maximum volume to resuspend (after which the volume is kept constant and the concentration is raised instead).

DEFAULT: 120

max_commands_per_file

Maximum commands that can be taken in by the liquid handler in one go. If this is exceeded, files are split into different components.

DEFAULT: None

plot_distribution_per_plate

If true, generate a volume distribution plot for each plate.

DEFAULT: True

target_concentration_per_plate

Set to a dictionary of concentrations per plate if different concentrations are desired for different plates.

DEFAULT: None

plate_size

Specify the size of the plate to generate.

DEFAULT: '384'

RETURNS DESCRIPTION

Distribution of volumes generated from the specified file.
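
The interplay between target_concentration_uM and volume_cap_ul can be sketched as follows. This is a minimal illustration only: it assumes the amount of DNA per well is given in nmol (the unit is not stated above), and the function name `resuspension_volume` is hypothetical.

```python
def resuspension_volume(amount_nmol, target_concentration_uM=1000, volume_cap_ul=120):
    """Return (volume_ul, final_concentration_uM) for a single well.

    Assumption: amount_nmol is the quantity of oligo in the well, in nmol.
    """
    # nmol divided by uM gives mL, so multiply by 1000 to obtain uL.
    volume_ul = amount_nmol * 1000 / target_concentration_uM
    if volume_ul > volume_cap_ul:
        # Volume is held at the cap, so the concentration ends up above target.
        volume_ul = volume_cap_ul
        final_concentration_uM = amount_nmol * 1000 / volume_ul
    else:
        final_concentration_uM = target_concentration_uM
    return volume_ul, final_concentration_uM


print(resuspension_volume(25))   # (25.0, 1000)
print(resuspension_volume(150))  # (120, 1250.0)
```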

visualize_plate_volume_distribution

visualize_plate_volume_distribution(file_output, volume_dist, target_concentration_uM)

Visualizes the distribution of resuspension volumes for a given list.

PARAMETER DESCRIPTION
file_output

Output filename

volume_dist

List of volumes (in ul)

target_concentration_uM

Target concentration used for this particular plate

RETURNS DESCRIPTION

N/A

Simple Plate Visuals

crisscross.helper_functions.simple_plate_visuals

visualize_plate_with_color_labels

visualize_plate_with_color_labels(
    plate_size,
    well_color_dict,
    color_label_dict=None,
    well_label_dict=None,
    plate_title=None,
    save_folder=None,
    save_file=None,
    direct_show=True,
    plate_display_aspect_ratio=1.495,
)

Use this function to create a plate graphic with specific colours placed in each well and, optionally, a unique label for each well.

PARAMETER DESCRIPTION
plate_size

'384' or '96'

well_color_dict

Dictionary containing colour to be placed in specific wells.

color_label_dict

Legend to use for specific colours (at the bottom of the image)

DEFAULT: None

well_label_dict

Optional label to place in specific wells

DEFAULT: None

plate_title

Title to display above plate

DEFAULT: None

save_folder

Output folder

DEFAULT: None

save_file

Output file name

DEFAULT: None

direct_show

Set to true to display the image directly

DEFAULT: True

plate_display_aspect_ratio

Aspect ratio to use for plate display

DEFAULT: 1.495

RETURNS DESCRIPTION

N/A

Slat Salient Quantities

crisscross.helper_functions.slat_salient_quantities

Standard Sequences

crisscross.helper_functions.standard_sequences

SLURM Process and Run

crisscross.helper_functions.slurm_process_and_run

create_o2_slurm_file

create_o2_slurm_file(
    command,
    num_cpus,
    memory,
    time_length,
    user_email="matthew_aquilina@dfci.harvard.edu",
)

Creates a standard slurm batch file script and then adds the provided command at the end of the script. The script is specifically formatted for the O2 server, but can be adjusted for other servers.

PARAMETER DESCRIPTION
command

The string command to add at the end of the batch file

num_cpus

Number of CPU cores to request

memory

Memory in GB to request

time_length

Time in hours to request (if < 12 hours, will be placed in short partition, otherwise medium)

user_email

The email to which failure notifications will be sent

DEFAULT: 'matthew_aquilina@dfci.harvard.edu'

RETURNS DESCRIPTION

The full slurm batch file script (string)
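
A batch script of this kind might be assembled roughly as below. This is a sketch under assumptions: the function name `build_slurm_script`, the exact set of #SBATCH directives, and the whole-hour time format are illustrative choices, not the output of create_o2_slurm_file. Only the short/medium partition rule is taken from the parameter description above.

```python
def build_slurm_script(command, num_cpus, memory_gb, time_hours, user_email):
    """Assemble a simple Slurm batch script as a string (illustrative only)."""
    # Jobs under 12 hours go to the short partition, otherwise medium.
    partition = "short" if time_hours < 12 else "medium"
    lines = [
        "#!/bin/bash",
        f"#SBATCH -c {num_cpus}",
        f"#SBATCH --mem={memory_gb}G",
        f"#SBATCH -t {time_hours}:00:00",  # assumes a whole number of hours
        f"#SBATCH -p {partition}",
        "#SBATCH --mail-type=FAIL",
        f"#SBATCH --mail-user={user_email}",
        "",
        command,  # the user command is appended at the end of the script
    ]
    return "\n".join(lines) + "\n"


script = build_slurm_script("python run_design.py", 4, 16, 6, "user@example.com")
print(script)
```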

Orthoseq Generator Module

Core Functions

Sequence Computations

orthoseq_generator.sequence_computations

SequencePairRegistry

SequencePairRegistry(
    length=7,
    fivep_ext="",
    threep_ext="",
    unwanted_substrings=None,
    apply_unwanted_to="core",
    seed=None,
    preselected_cores=None,
)

Stateful generator/registry for DNA sequence pairs.

It generates random core sequences of fixed length, forms the pair (seq, revcom(seq)), applies constraints, and assigns stable integer IDs.

If a generated pair has been seen before, it returns the previously assigned ID instead of creating a new one.

PARAMETER DESCRIPTION
length

Length of the core DNA sequence (without flanks).

TYPE: int DEFAULT: 7

fivep_ext

Optional 5′ flanking sequence prepended to each strand.

TYPE: str DEFAULT: ''

threep_ext

Optional 3′ flanking sequence appended to each strand.

TYPE: str DEFAULT: ''

unwanted_substrings

List of substrings that disqualify a sequence. Example: ["AAAA", "CCCC", "GGGG", "TTTT"].

TYPE: list[str] | None DEFAULT: None

apply_unwanted_to

Where to apply unwanted_substrings checks:
- "core": apply only to the random core sequences
- "full": apply to the full flanked sequences

TYPE: str DEFAULT: 'core'

seed

Optional RNG seed for reproducibility.

TYPE: int | None DEFAULT: None

preselected_cores

Optional iterable of core sequences to draw from instead of random generation. Sampling is without replacement in random order.

TYPE: iterable[str] | None DEFAULT: None

sample_pair

sample_pair(max_tries=10000)

Generates (or reuses) a random sequence pair and returns (pair_id, pair).

Behavior
  • If preselected_cores were provided, draws from that list (random, with replacement).
  • Draw random core sequences until constraints pass.
  • Convert to canonical (sorted) flanked pair.
  • If pair was seen: return existing ID.
  • Else: assign new ID, store, return it.
PARAMETER DESCRIPTION
max_tries

Maximum attempts before raising an error (prevents infinite loops).

TYPE: int DEFAULT: 10000

RETURNS DESCRIPTION
tuple[int, tuple[str, str]]

(pair_id, (seq, rc_seq)) where seq/rc_seq are flanked and sorted.

get_pair_by_id

get_pair_by_id(pair_id)

Returns the stored pair for a given ID.

PARAMETER DESCRIPTION
pair_id

Integer ID returned by sample_pair.

TYPE: int

RETURNS DESCRIPTION
tuple[str, str]

(seq, rc_seq) canonical sorted pair.
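
The "generate, canonicalize, reuse ID" behaviour of the registry can be sketched with a much smaller stand-in class. `TinyPairRegistry` is hypothetical: it omits flanks and preselected cores, and it hard-codes a single example constraint (no run of four identical bases).

```python
import random

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


class TinyPairRegistry:
    """Simplified stand-in for a stateful sequence-pair registry."""

    def __init__(self, length=7, seed=None):
        self.length = length
        self.rng = random.Random(seed)
        self.pair_to_id = {}
        self.id_to_pair = {}

    def sample_pair(self, max_tries=10000):
        for _ in range(max_tries):
            core = "".join(self.rng.choice("ACGT") for _ in range(self.length))
            if any(base * 4 in core for base in "ACGT"):
                continue  # example constraint failed, draw again
            rc = core.translate(_COMPLEMENT)[::-1]
            pair = tuple(sorted((core, rc)))  # canonical (sorted) form
            if pair not in self.pair_to_id:
                new_id = len(self.pair_to_id)
                self.pair_to_id[pair] = new_id
                self.id_to_pair[new_id] = pair
            return self.pair_to_id[pair], pair
        raise RuntimeError("no valid sequence found within max_tries")

    def get_pair_by_id(self, pair_id):
        return self.id_to_pair[pair_id]


registry = TinyPairRegistry(length=7, seed=0)
pair_id, pair = registry.sample_pair()
assert registry.get_pair_by_id(pair_id) == pair
```

Because the pair is sorted before lookup, drawing a core or its reverse complement resolves to the same ID.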

revcom

revcom(sequence)

Computes the reverse complement of a DNA sequence.

PARAMETER DESCRIPTION
sequence

Single DNA sequence as a string.

TYPE: str

RETURNS DESCRIPTION
str

Reverse complement of the input sequence as a string.
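
For reference, a reverse complement can be computed in a couple of lines. The helper name `reverse_complement` is hypothetical and the sketch assumes an uppercase ACGT alphabet.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Complement each base, then reverse the strand."""
    return sequence.translate(_COMPLEMENT)[::-1]


print(reverse_complement("AACG"))  # CGTT
```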

has_four_consecutive_bases

has_four_consecutive_bases(seq)

Returns True if the sequence contains four identical consecutive bases (e.g., "GGGG", "CCCC", "AAAA", "TTTT").

Notes

Additional sequence constraints (e.g., homopolymer runs of other lengths) can be added here as needed.

PARAMETER DESCRIPTION
seq

DNA sequence as a string.

TYPE: str

RETURNS DESCRIPTION
bool

True if any base appears four times in a row, False otherwise.
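
One way to express this check is a back-referencing regular expression. This is a sketch with a hypothetical name (`contains_run_of_four`), not necessarily the approach used by has_four_consecutive_bases.

```python
import re

# One base followed by three more copies of the same base.
_RUN_OF_FOUR = re.compile(r"([ACGT])\1{3}")


def contains_run_of_four(seq: str) -> bool:
    return _RUN_OF_FOUR.search(seq) is not None


print(contains_run_of_four("ACGGGGT"))  # True
print(contains_run_of_four("ACGGGT"))   # False
```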

sorted_key

sorted_key(seq1, seq2)

Returns a tuple with the two input sequences sorted alphabetically.

Description

Ensures that (seq1, seq2) and (seq2, seq1) map to the same dictionary key.

PARAMETER DESCRIPTION
seq1

First DNA sequence.

TYPE: str

seq2

Second DNA sequence.

TYPE: str

RETURNS DESCRIPTION
tuple

Tuple of the two sequences in alphabetical order.
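
A minimal sketch of such an order-independent key, using the hypothetical name `pair_key`:

```python
def pair_key(seq1: str, seq2: str) -> tuple:
    """Return the two sequences in alphabetical order."""
    return (seq1, seq2) if seq1 <= seq2 else (seq2, seq1)


energies = {pair_key("GTT", "AAC"): -5.2}
# Lookup succeeds regardless of argument order.
print(energies[pair_key("AAC", "GTT")])  # -5.2
```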

create_sequence_pairs_pool

create_sequence_pairs_pool(length=7, fivep_ext='', threep_ext='', avoid_gggg=True)

Generates a list of unique DNA sequence pairs (and their reverse complements) with optional flanking sequences.

Procedure
  1. Generate all possible core sequences of specified length.
  2. Compute each sequence's reverse complement and alphabetically sort the pair.
  3. If avoid_gggg is True, filter out any pair where either sequence contains four identical bases in a row.
  4. Prepend fivep_ext and append threep_ext to both members of each pair.
  5. Enumerate the resulting list, assigning a unique integer ID to each pair.
PARAMETER DESCRIPTION
length

Length of the core DNA sequences (without flanks).

TYPE: int DEFAULT: 7

fivep_ext

Optional 5′ flanking sequence prepended to each strand.

TYPE: str DEFAULT: ''

threep_ext

Optional 3′ flanking sequence appended to each strand.

TYPE: str DEFAULT: ''

avoid_gggg

If True, filters out pairs containing four identical consecutive bases.

TYPE: bool DEFAULT: True

RETURNS DESCRIPTION
list of tuple

List of tuples [(index, (sequence, reverse_complement)), ...], where index is a unique ID and each tuple contains the complementary pair.
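
The procedure above can be sketched as follows. The function name `build_pair_pool` is hypothetical, and using a set for de-duplication and a sorted order for ID assignment are assumptions about details the description leaves open.

```python
from itertools import product


def build_pair_pool(length=7, fivep_ext="", threep_ext="", avoid_runs=True):
    """Enumerate unique (sequence, reverse complement) pairs with integer IDs."""
    complement = str.maketrans("ACGT", "TGCA")
    homopolymers = ("AAAA", "CCCC", "GGGG", "TTTT")
    unique_pairs = set()
    for bases in product("ACGT", repeat=length):
        seq = "".join(bases)
        rc = seq.translate(complement)[::-1]
        pair = tuple(sorted((seq, rc)))  # same key for seq and its complement
        if avoid_runs and any(h in strand for strand in pair for h in homopolymers):
            continue
        unique_pairs.add(pair)
    flanked = [
        (fivep_ext + a + threep_ext, fivep_ext + b + threep_ext)
        for a, b in sorted(unique_pairs)
    ]
    return list(enumerate(flanked))


pool = build_pair_pool(length=4)
print(len(pool))  # 134
```

For length 4 there are 256 cores, 16 of which are their own reverse complement, giving 136 unique pairs; the filter removes the AAAA/TTTT and CCCC/GGGG pairs.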

create_seqwalk_sequence_pairs_pool

create_seqwalk_sequence_pairs_pool(
    length=7,
    k=3,
    seed=None,
    fivep_ext="",
    threep_ext="",
    alphabet="ACGT",
    avoid_reverse_complements=True,
    gc_lims=None,
    prevented_patterns=None,
    verbose=True,
)

Generates sequence pairs from SeqWalk and converts them into this module's pair format.

This is a thin integration layer around seqwalk.design.max_size. SeqWalk designs a maximal library of core sequences for a chosen sequence-symmetry minimization (SSM) k value, optionally excluding reverse complements. The resulting core sequences are then converted into canonical (seq, revcom(seq)) pairs with optional flanks.

PARAMETER DESCRIPTION
length

Length of the core DNA sequences produced by SeqWalk.

TYPE: int DEFAULT: 7

k

Sequence symmetry minimization (SSM) k value passed to SeqWalk.

TYPE: int DEFAULT: 3

seed

Optional Python random seed for deterministic SeqWalk output.

TYPE: int | None DEFAULT: None

fivep_ext

Optional 5′ flank prepended to both strands.

TYPE: str DEFAULT: ''

threep_ext

Optional 3′ flank appended to both strands.

TYPE: str DEFAULT: ''

alphabet

Allowed DNA alphabet passed to SeqWalk.

TYPE: str DEFAULT: 'ACGT'

avoid_reverse_complements

If True, request an RC-free SeqWalk library.

TYPE: bool DEFAULT: True

gc_lims

Optional (min_gc, max_gc) tuple passed to SeqWalk.

TYPE: tuple[int, int] | None DEFAULT: None

prevented_patterns

Optional list of forbidden patterns passed to SeqWalk.

TYPE: list[str] | None DEFAULT: None

verbose

If True, allow SeqWalk to print progress information.

TYPE: bool DEFAULT: True

RETURNS DESCRIPTION
list[tuple[int, tuple[str, str]]]

List of (index, (sequence, reverse_complement)) tuples.

nupack_compute_energy_precompute_library_fast

nupack_compute_energy_precompute_library_fast(
    seq1, seq2, type="total", Use_Library=None
)

Computes the Gibbs free energy of hybridization between two DNA sequences using NUPACK, with optional caching via a precompute library.

Notes
  • Uses a local cache to avoid redundant NUPACK calls when Use_Library=True. If the argument is None, the global setting in hf.USE_LIBRARY is used.
  • Energies are stored under a sorted key so (seq1, seq2) and (seq2, seq1) map identically. This function does not write back to disk; cache updates are handled by callers.
  • Called by multiprocessing; each worker loads its own cache copy once from file.
  • Does not write to the cache during multiprocessing to prevent conflicts.
  • All energies larger than -1 kcal/mol are mapped to -1 kcal/mol. 0 is used in other routines as an indicator that the energy has not been computed. -1 kcal/mol is already extremely weak (virtually no interaction).
  • Model parameters are fixed at 37°C, sodium=0.05 M, magnesium=0.025 M; change with a fresh cache.
PARAMETER DESCRIPTION
seq1

First DNA sequence.

TYPE: str

seq2

Second DNA sequence.

TYPE: str

type

Either 'total' (partition sum) or 'minimum' (MFE) calculation. The result of 'total' is what you would use to compute a binding constant.

TYPE: str DEFAULT: 'total'

Use_Library

If True, use and load the precompute cache; defaults to global setting.

TYPE: bool | None DEFAULT: None

RETURNS DESCRIPTION
tuple[float, float, float] | float

Tuple (energy, G_A, G_B) where energy is the association free energy (kcal/mol). For homodimers, G_B == G_A. If NUPACK returns no MFE or an exception occurs, the function returns -1.0 (scalar) instead.
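
The cache lookup and the -1 kcal/mol clamp described in the notes can be illustrated without NUPACK. In this sketch `lookup_or_compute` and `compute_fn` are hypothetical: `compute_fn` stands in for the actual NUPACK call, and the return value is simplified to a single float rather than the (energy, G_A, G_B) tuple.

```python
def lookup_or_compute(seq1, seq2, cache, compute_fn):
    """Return a clamped energy, reading from (but never writing to) the cache."""
    key = tuple(sorted((seq1, seq2)))  # order-independent key
    if key in cache:
        return cache[key]
    # Anything weaker than -1 kcal/mol is reported as -1, so that 0 can
    # remain a marker for "not computed" elsewhere.
    return min(compute_fn(seq1, seq2), -1.0)


cache = {("AAC", "GTT"): -5.0}
print(lookup_or_compute("GTT", "AAC", cache, lambda a, b: 0.0))   # -5.0
print(lookup_or_compute("AAA", "CCC", cache, lambda a, b: -0.2))  # -1.0
```

Leaving cache writes to the caller mirrors the note that workers do not modify the cache during multiprocessing.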

compute_pair_energy_on

compute_pair_energy_on(i, seq, rc_seq)

Helper function for parallel computing of on-target energies.

PARAMETER DESCRIPTION
i

Sequence index.

TYPE: int

seq

DNA sequence.

TYPE: str

rc_seq

Reverse complement sequence.

TYPE: str

RETURNS DESCRIPTION
tuple[int, float, float, float]

Tuple (i, pair_energy, self_energy_seq, self_energy_rc_seq).

compute_ontarget_energies

compute_ontarget_energies(sequence_list)

Computes on-target Gibbs free energies for a list of sequence pairs using multiprocessing.

Notes
  • Uses ProcessPoolExecutor (with initializer=_init_worker) to parallelize calls to NUPACK via nupack_compute_energy_precompute_library_fast.
  • If hf.USE_LIBRARY is True, the initializer function (_init_worker) passes the library filename and flag to each worker so that nupack_compute_energy_precompute_library_fast can load its cache. After all parallel computations finish, this function saves the cache with the new energies.
  • Saves the updated cache atomically using DelayedKeyboardInterrupt to prevent corruption.
  • Prints progress and CPU core usage to the console.
PARAMETER DESCRIPTION
sequence_list

List of tuples, each containing a sequence and its reverse complement.

TYPE: list of tuple

RETURNS DESCRIPTION
tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray]

Tuple of NumPy arrays (pair_energies, self_energies_seq, self_energies_rc_seq).

compute_pair_energy_off

compute_pair_energy_off(i, j, seq1, seq2)

Helper function for parallel computing of off-target energies.

PARAMETER DESCRIPTION
i

Index of the first sequence.

TYPE: int

j

Index of the second sequence.

TYPE: int

seq1

First DNA sequence.

TYPE: str

seq2

Second DNA sequence.

TYPE: str

RETURNS DESCRIPTION
tuple (int, int, float)

Tuple (i, j, energy) where energy is the computed Gibbs free energy.

compute_offtarget_energies

compute_offtarget_energies(sequence_pairs)

Computes off-target hybridization energies for all pairwise combinations of a given list of sequence pairs.

Procedure
  1. Extract handles and antihandles from sequence_pairs.
  2. Initialize three N×N energy matrices for:
     • handle-handle interactions
     • antihandle-antihandle interactions
     • handle-antihandle interactions
  3. For each matrix, use ProcessPoolExecutor (via compute_pair_energy_off) to fill only the required entries:
     • i ≥ j for the two symmetric matrices
     • i ≠ j for the mixed handle-antihandle matrix
  4. If hf.USE_LIBRARY is True, the initializer function (_init_worker) passes the library filename and flag to each worker so that nupack_compute_energy_precompute_library_fast can load its cache. After all parallel computations finish, this function saves the cache with the new energies.
Notes
  • Off-target interactions are computed for:
    1) handle with handle
    2) antihandle with antihandle
    3) handle with antihandle
  • Symmetric matrices only compute the lower triangle (i ≥ j) to avoid redundancy.
  • Entries with no interaction or with computation errors are set to -1.0, as is any energy greater than -1.0.
    A value of 0 indicates the energy was skipped due to redundancy.
  • Uses DelayedKeyboardInterrupt to ensure atomic writes when saving the updated cache.
PARAMETER DESCRIPTION
sequence_pairs

List of (sequence, reverse_complement) tuples.

TYPE: list of tuple

RETURNS DESCRIPTION
dict

Dictionary containing three N×N numpy arrays with keys:
- 'handle_handle_energies'
- 'antihandle_handle_energies'
- 'antihandle_antihandle_energies'
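
The index pattern used to fill the three matrices can be sketched serially, without multiprocessing or NUPACK. Here `fill_offtarget_matrices` and `energy_fn` are hypothetical stand-ins, nested lists replace NumPy arrays, and the row/column orientation of the mixed matrix (antihandle index first) is an assumption.

```python
def fill_offtarget_matrices(sequence_pairs, energy_fn):
    """Fill only the required entries; skipped entries stay at 0.0."""
    handles = [seq for seq, _ in sequence_pairs]
    antihandles = [rc for _, rc in sequence_pairs]
    n = len(sequence_pairs)
    hh = [[0.0] * n for _ in range(n)]
    aa = [[0.0] * n for _ in range(n)]
    ah = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i >= j:  # symmetric matrices: lower triangle only
                hh[i][j] = energy_fn(handles[i], handles[j])
                aa[i][j] = energy_fn(antihandles[i], antihandles[j])
            if i != j:  # mixed matrix: skip the on-target diagonal
                ah[i][j] = energy_fn(antihandles[i], handles[j])
    return {
        "handle_handle_energies": hh,
        "antihandle_handle_energies": ah,
        "antihandle_antihandle_energies": aa,
    }


pairs = [("AAC", "GTT"), ("ACC", "GGT"), ("AGC", "GCT")]
result = fill_offtarget_matrices(pairs, lambda a, b: -2.0)
print(result["handle_handle_energies"][0])  # [-2.0, 0.0, 0.0]
```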

select_subset

select_subset(sequence_pairs, max_size=200, timeout_s=20)

Selects a random subset of sequence pairs up to a specified maximum size.

This function supports two input types:
1) A precomputed pool: list of (index, (seq, rc_seq)) tuples.
   - If pool size > max_size: uses random.sample for efficiency.
   - Else: returns all pairs.
2) A generator/registry object that provides sample_pair().
   - Repeatedly calls sample_pair() until max_size unique pairs are collected, or timeout_s is reached.

Notes
  • For list input: uses sampling rather than shuffling for performance.
  • For registry input: guarantees uniqueness by ID (not by sequence string), so repeated samples do not inflate the subset.
Timeout behavior

If timeout_s is reached while using a registry, the function returns the pairs found so far and prints: "Only X of requested Y found (timeout)."

PARAMETER DESCRIPTION
sequence_pairs

Either:
- a list of (index, (seq, rc_seq)) tuples, or
- an object with method sample_pair() -> (pair_id, (seq, rc_seq)).

TYPE: list | object

max_size

Maximum number of pairs to select.

TYPE: int DEFAULT: 200

timeout_s

Optional timeout in seconds (only used for registry input).

TYPE: float | None DEFAULT: 20

RETURNS DESCRIPTION
list of tuple

List of (seq, rc_seq) pairs selected.
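
The two input paths can be sketched as below. `pick_subset` is a hypothetical name and the code is a simplified reading of the description above, not the library's implementation.

```python
import random
import time


def pick_subset(sequence_pairs, max_size=200, timeout_s=20):
    """Select up to max_size (seq, rc_seq) pairs from a list or a registry."""
    if isinstance(sequence_pairs, list):
        if len(sequence_pairs) > max_size:
            chosen = random.sample(sequence_pairs, max_size)
        else:
            chosen = sequence_pairs
        return [pair for _, pair in chosen]

    # Registry path: keyed by ID so repeated draws do not inflate the subset.
    found = {}
    start = time.monotonic()
    while len(found) < max_size:
        if timeout_s is not None and time.monotonic() - start > timeout_s:
            print(f"Only {len(found)} of requested {max_size} found (timeout).")
            break
        pair_id, pair = sequence_pairs.sample_pair()
        found[pair_id] = pair
    return list(found.values())


pool = [(i, (f"SEQ{i}", f"RC{i}")) for i in range(5)]
print(len(pick_subset(pool, max_size=3)))  # 3
```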

crossreference_sequences

crossreference_sequences(
    new_pair, pool, offtarget_limit, max_pair_violations=0, Use_Library=None
)

Checks off-target interactions between a candidate sequence pair and a history pool.

Counts violations per pool pair, not per individual strand-strand interaction. A pool pair is counted as violating if any of the four pairwise comparisons between (seq, rc_seq) and (pool_seq, pool_rc) falls below offtarget_limit.

PARAMETER DESCRIPTION
new_pair

Candidate (seq, rc_seq) pair to test.

TYPE: tuple[str, str]

pool

Existing (seq, rc_seq) pairs to cross-reference against.

TYPE: list[tuple[str, str]]

offtarget_limit

Energy cutoff below which an off-target interaction is considered a violation.

TYPE: float

max_pair_violations

Maximum number of violating pool pairs allowed before the candidate is rejected.

TYPE: int DEFAULT: 0

Use_Library

Whether to use the precomputed energy library (overrides the global setting if not None).

TYPE: bool | None DEFAULT: None

RETURNS DESCRIPTION
tuple[bool, int]

Tuple (passed, nupack_calls) where passed is False if the number of violating pool pairs exceeds max_pair_violations, and nupack_calls is the number of direct energy computations performed during this cross-reference check.
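
Counting violations per pool pair, rather than per strand-strand comparison, can be sketched as follows. `passes_crossreference` and `energy_fn` are hypothetical (the latter stands in for the NUPACK energy call), and stopping as soon as a pool pair violates or the budget is exceeded is an assumption about the control flow.

```python
def passes_crossreference(new_pair, pool, offtarget_limit, energy_fn,
                          max_pair_violations=0):
    """Return (passed, energy_calls) for a candidate pair against a pool."""
    violating_pairs = 0
    energy_calls = 0
    for pool_pair in pool:
        pair_violates = False
        # Four comparisons: each candidate strand against each pool strand.
        for new_strand in new_pair:
            for pool_strand in pool_pair:
                energy_calls += 1
                if energy_fn(new_strand, pool_strand) < offtarget_limit:
                    pair_violates = True
                    break
            if pair_violates:
                break
        if pair_violates:
            # A pool pair counts once, however many of its comparisons fail.
            violating_pairs += 1
            if violating_pairs > max_pair_violations:
                return False, energy_calls
    return True, energy_calls


def stub_energy(a, b):
    # Toy model: any interaction involving "GGGCCC" is strong.
    return -9.0 if "GGGCCC" in (a, b) else -2.0


candidate = ("AACCTT", "AAGGTT")
history = [("ACACAC", "GTGTGT"), ("GGGCCC", "GGGCCC")]
print(passes_crossreference(candidate, history, -5.0, stub_energy))  # (False, 5)
```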

select_subset_in_energy_range

select_subset_in_energy_range(
    sequence_pairs,
    energy_min=-inf,
    energy_max=inf,
    self_energy_min=-inf,
    max_size=inf,
    Use_Library=None,
    avoid_indices=None,
    timeout_s=None,
    history_pool=None,
    allowed_violations=0,
    offtarget_limit=None,
    max_nupack_calls=None,
    progress_every=None,
)

Selects a random subset of sequence pairs that pass on-target energy, self-energy, and optional cross-reference filters.

Supports two input types:
1) Precomputed list of (index, (seq, rc_seq)) tuples.
2) SequencePairRegistry-like object with sample_pair() method.

Notes
  • Uses random sampling without full shuffling.
  • Keeps returned sequence order aligned with returned indices list.
  • Can stop early due to timeout_s, max_nupack_calls, or candidate exhaustion.
  • If offtarget_limit is None, cross-reference filtering is skipped.
PARAMETER DESCRIPTION
sequence_pairs

List of (index, (seq, rc_seq)) tuples or registry with sample_pair().

TYPE: list | object

energy_min

Minimum acceptable on-target (association) energy.

TYPE: float DEFAULT: -inf

energy_max

Maximum acceptable on-target (association) energy.

TYPE: float DEFAULT: inf

self_energy_min

Minimum acceptable self-energy for each strand.

TYPE: float DEFAULT: -inf

max_size

Maximum number of pairs to return.

TYPE: int DEFAULT: inf

Use_Library

Whether to use the precomputed energy library (overrides global if not None).

TYPE: bool | None DEFAULT: None

avoid_indices

Indices to avoid when sampling.

TYPE: set | None DEFAULT: None

timeout_s

Optional wall-clock timeout in seconds; returns early if exceeded.

TYPE: float | None DEFAULT: None

history_pool

Optional list of accepted (seq, rc_seq) pairs to cross-reference against.

TYPE: list[tuple[str, str]] | None DEFAULT: None

allowed_violations

Maximum number of pool pairs allowed to violate offtarget_limit.

TYPE: int DEFAULT: 0

offtarget_limit

Optional off-target energy cutoff for cross-reference filtering.

TYPE: float | None DEFAULT: None

max_nupack_calls

Optional limit on direct NUPACK energy computations made inside this function.

TYPE: int | None DEFAULT: None

progress_every

Optional attempt interval for progress prints.

TYPE: int | None DEFAULT: None

RETURNS DESCRIPTION
tuple[list[tuple[str, str]], list[int], bool, int]

Tuple (subset, indices, stopped_early, nupack_calls) where subset is a list of (seq, rc_seq) pairs, indices are their corresponding global IDs, stopped_early indicates timeout or NUPACK-budget exit, and nupack_calls is the number of direct NUPACK computations made inside this function.

select_all_in_energy_range

select_all_in_energy_range(
    sequence_pairs, energy_min=-inf, energy_max=inf, Use_Library=None, avoid_ids=None
)

Selects all sequence pairs whose on-target energies fall within a given energy range.

Description

Iterates through every (global_index, (seq, rc_seq)) tuple, computes the on-target energy using nupack_compute_energy_precompute_library_fast, and collects those where energy_min <= energy <= energy_max, skipping any global_index values in avoid_ids. Note that the ID here refers to the global index in the original sequence-pair list.

Notes
  • If Use_Library is True, energies are fetched from or stored in the precompute cache.
  • Prints progress messages to the console.
PARAMETER DESCRIPTION
sequence_pairs

List of (global_index, (seq, rc_seq)) tuples.

TYPE: list of tuple

energy_min

Minimum allowed Gibbs free energy (inclusive).

TYPE: float DEFAULT: -inf

energy_max

Maximum allowed Gibbs free energy (inclusive).

TYPE: float DEFAULT: inf

Use_Library

Whether to use a precomputed energy library (overrides global if not None).

TYPE: bool | None DEFAULT: None

avoid_ids

Set of global indices to skip during selection.

TYPE: set | None DEFAULT: None

RETURNS DESCRIPTION
tuple (list of tuple, list of int)

Tuple (subset, selected_ids) where:
  • subset is a list of (seq, rc_seq) pairs within the energy range.
  • selected_ids is a list of their corresponding global indices.
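The selection rule described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `toy_energy` and `select_in_range_sketch` are hypothetical names, and `toy_energy` stands in for the NUPACK-backed energy computation so the example runs without NUPACK.

```python
from math import inf


def toy_energy(seq, rc_seq):
    # Hypothetical stand-in for the on-target energy computation:
    # -1.0 kcal/mol per G or C base in `seq`.
    return -1.0 * sum(base in "GC" for base in seq)


def select_in_range_sketch(sequence_pairs, energy_min=-inf, energy_max=inf, avoid_ids=None):
    avoid_ids = avoid_ids or set()
    subset, selected_ids = [], []
    for global_index, (seq, rc_seq) in sequence_pairs:
        if global_index in avoid_ids:
            continue  # skip indices the caller asked to avoid
        energy = toy_energy(seq, rc_seq)
        if energy_min <= energy <= energy_max:  # both bounds are inclusive
            subset.append((seq, rc_seq))
            selected_ids.append(global_index)
    return subset, selected_ids


pairs = [(0, ("GGCC", "GGCC")), (1, ("ATAT", "ATAT")), (2, ("GATC", "GATC"))]
subset, ids = select_in_range_sketch(pairs, energy_min=-4.0, energy_max=-2.0, avoid_ids={0})
```

Pair 0 falls inside the range but is skipped through `avoid_ids`, so only global index 2 is returned.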

compute_offtarget_fraction_below_limit

compute_offtarget_fraction_below_limit(off_energies, off_limit)

Computes the fraction of off-target energies that are below off_limit.

Notes
  • If off_energies is a dict of matrices, values are flattened and concatenated.
  • For dict input, zeros are excluded because they represent uncomputed entries.
PARAMETER DESCRIPTION
off_energies

Off-target energies as an array-like or dict of energy matrices.

TYPE: array-like | dict

off_limit

Threshold energy (kcal/mol).

TYPE: float

RETURNS DESCRIPTION
float

Fraction of values < off_limit in [0, 1]. Returns 0.0 if no values are available.
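A minimal sketch of the documented behavior, assuming NumPy inputs. The function name `fraction_below_limit_sketch` is hypothetical and the code is not taken from the library.

```python
import numpy as np


def fraction_below_limit_sketch(off_energies, off_limit):
    if isinstance(off_energies, dict):
        # Flatten and concatenate every matrix, then drop zeros (uncomputed entries).
        parts = [np.asarray(m, dtype=float).ravel() for m in off_energies.values()]
        values = np.concatenate(parts) if parts else np.array([])
        values = values[values != 0]
    else:
        values = np.asarray(off_energies, dtype=float).ravel()
    if values.size == 0:
        return 0.0  # nothing to evaluate
    return float(np.mean(values < off_limit))


off = {
    "handle_handle_energies": np.array([[0.0, -8.0], [-8.0, 0.0]]),
    "antihandle_handle_energies": np.array([[-3.0, -2.0], [-2.0, -3.0]]),
}
fraction = fraction_below_limit_sketch(off, off_limit=-5.0)
```

Six non-zero values remain after the zeros are dropped, and two of them lie below -5.0.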

plot_on_off_target_histograms

plot_on_off_target_histograms(
    on_energies,
    off_energies,
    bins=80,
    output_path=None,
    show_plot=True,
    vlines=None,
    title=None,
    xlim=None,
)

Plots histograms comparing on-target and off-target Gibbs free energy distributions.

Notes
  • If off_energies is a dict, combines the following matrices into a single array, excluding zeros (uncomputed values):
    • 'handle_handle_energies'
    • 'antihandle_handle_energies'
    • 'antihandle_antihandle_energies'
  • Normalizes frequencies so that area under each histogram sums to 1.
  • Uses consistent bin edges across both distributions for direct comparison.
  • Saves the figure to output_path if provided, otherwise only displays it.
  • Prints summary statistics after plotting.
PARAMETER DESCRIPTION
on_energies

On-target energy values.

TYPE: array-like

off_energies

Off-target energies as an array-like or dict of energy matrices.

TYPE: array-like | dict

bins

Number of bins for histograms.

TYPE: int DEFAULT: 80

output_path

File path to save the plot; if None, the plot is only displayed.

TYPE: str | None DEFAULT: None

show_plot

Whether to call plt.show() to display the plot.

TYPE: bool DEFAULT: True

vlines

Optional dictionary of additional vertical lines to draw. Special keys: 'min_ontarget'.

TYPE: dict | None DEFAULT: None

title

Optional custom plot title. If None, a default title is used.

TYPE: str | None DEFAULT: None

xlim

Optional x-axis limits as (xmin, xmax). If None, limits are inferred from data.

TYPE: tuple[float, float] | None DEFAULT: None

RETURNS DESCRIPTION
dict

Dictionary of summary statistics:

  • 'min_on': Minimum on-target energy
  • 'mean_on': Mean of on-target energies
  • 'std_on': Standard deviation of on-target energies
  • 'max_on': Maximum on-target energy
  • 'mean_off': Mean of off-target energies
  • 'std_off': Standard deviation of off-target energies
  • 'min_off': Minimum off-target energy

plot_self_energy_histogram

plot_self_energy_histogram(self_energies, bins=30, output_path=None, show_plot=True)

Plots a histogram of self-energies (e.g., G_A and G_B combined).

Notes
  • Accepts a single array-like, a tuple/list of arrays (e.g., (G_A, G_B)), or a dict of arrays; all values are flattened and concatenated.
  • Uses the same visual style as plot_on_off_target_histograms.
  • Prints summary statistics after plotting.
PARAMETER DESCRIPTION
self_energies

Array-like, tuple/list of arrays, or dict of arrays.

TYPE: array-like | tuple/list | dict

bins

Number of bins for histogram.

TYPE: int DEFAULT: 30

output_path

File path to save the plot; if None, the plot is only displayed.

TYPE: str | None DEFAULT: None

show_plot

Whether to call plt.show() to display the plot.

TYPE: bool DEFAULT: True

Vertex Cover Algorithms

orthoseq_generator.vertex_cover_algorithms

min_ontarget module-attribute

min_ontarget = -10.4

Select sequences with on-target energy in desired range

subset, indices, _, _ = select_subset_in_energy_range( ontarget7mer, energy_min=min_ontarget, energy_max=max_ontarget, max_size=30, Use_Library=True, avoid_indices=set() )

Compute off-target energies for the subset

off_e_subset = compute_offtarget_energies(subset, Use_Library=False)

Build the off-target interaction graph

Edges = build_edges(off_e_subset, indices, offtarget_limit)

heuristic_vertex_cover_optimized2

heuristic_vertex_cover_optimized2(E, avoid_V=None, cleanup=True)

This function is the core of the sequence search algorithm. It’s a heuristic approach to solve the NP-hard minimum vertex cover problem.

Inspired by:
  • Joshi (2020), "Neighbourhood Evaluation Criteria for Vertex Cover Problem"
  • StackExchange discussion: https://cs.stackexchange.com/q/74546

Algorithm Outline
  1. Immediately add any self-edge vertices (u == v) to the cover.
  2. Build an adjacency list for all non-self edges.
  3. Track the degree (number of neighbors) for each vertex.
  4. While edges remain:
    a. Identify the vertex/vertices with maximum degree.
    b. Among those, select the vertex with the fewest neighbors that also share that max degree.
    c. Break ties randomly, preferring vertices in avoid_V.
    d. Add the selected vertex to the cover, remove it and its incident edges, and update degrees.
Notes
  • avoid_V contains vertices that should be removed when possible, but they can still be kept.
  • Self-edges are covered immediately.
  • Orphan vertices (degree zero) are naturally independent and never need removal.
PARAMETER DESCRIPTION
E

Set of edges (u, v). Vertices can be any hashable.

TYPE: iterable of tuple

avoid_V

Vertices you’d like to preferentially remove into the cover. They can still be kept, just less likely.

TYPE: (set, optional) DEFAULT: None

cleanup

If True, remove any redundant vertices from the final cover without uncovering any edges.

TYPE: bool DEFAULT: True

RETURNS DESCRIPTION
set

A vertex cover (set of vertices touching every edge in E).
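The greedy core of the outline can be sketched as below. This is a deliberately simplified illustration under stated assumptions: `greedy_vertex_cover_sketch` is a hypothetical name, ties between maximum-degree vertices are broken by dictionary order rather than by the neighbor-based rule and random tie-breaking described above, and avoid_V is not modeled.

```python
from collections import defaultdict


def greedy_vertex_cover_sketch(edges, cleanup=True):
    edges = list(edges)
    # Step 1: a self-edge (u, u) can only be covered by u itself.
    cover = {u for u, v in edges if u == v}
    # Step 2: adjacency sets for non-self edges that are not covered yet.
    adj = defaultdict(set)
    for u, v in edges:
        if u != v and u not in cover and v not in cover:
            adj[u].add(v)
            adj[v].add(u)
    # Steps 3-4: repeatedly move a maximum-degree vertex into the cover.
    while any(adj.values()):
        best = max(adj, key=lambda x: len(adj[x]))  # simplified tie-breaking
        cover.add(best)
        for neighbor in adj.pop(best):
            adj[neighbor].discard(best)
    if cleanup:
        # Drop a vertex if every edge stays covered without it.
        for x in sorted(cover, key=repr):
            rest = cover - {x}
            if all(u in rest or v in rest for u, v in edges):
                cover = rest
    return cover


edges = [(0, 1), (0, 2), (0, 3), (4, 4)]
cover = greedy_vertex_cover_sketch(edges)
```

For this star graph plus one self-edge, the hub vertex 0 and the self-edge vertex 4 form the cover; vertices 1, 2, and 3 remain as the independent set.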

find_uncovered_edges

find_uncovered_edges(E, vertex_cover)

Finds edges that are not covered by the current vertex cover.

Description

Given a collection of edges E and a set vertex_cover of vertices, this function returns every edge that has neither endpoint in vertex_cover. Strictly speaking, vertex_cover may be only a partial cover of the graph rather than a full vertex cover.

PARAMETER DESCRIPTION
E

Collection of edges (u, v).

TYPE: iterable of tuple

vertex_cover

Set of vertices currently in the cover.

TYPE: set

RETURNS DESCRIPTION
set

Edges (u, v) from E for which neither u nor v is in vertex_cover. Self-edges (u == v) are included if u is not in the cover.
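As a short sketch of this rule (the name `find_uncovered_edges_sketch` is hypothetical and the code is not the library's implementation):

```python
def find_uncovered_edges_sketch(edges, vertex_cover):
    # An edge is uncovered when neither endpoint is in the (partial) cover.
    return {(u, v) for u, v in edges if u not in vertex_cover and v not in vertex_cover}


edges = [(1, 2), (2, 3), (3, 4), (5, 5)]
uncovered = find_uncovered_edges_sketch(edges, vertex_cover={2})
```

With only vertex 2 in the cover, the edge (3, 4) and the self-edge (5, 5) are still uncovered.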

build_edges

build_edges(offtarget_dict, indices, energy_cutoff)

Builds a list of global index‐pair edges from off‐target energy matrices. (Global indices refer to the positions in the originally created sequence-pair list.)

Procedure
  1. Extract all (i, j) positions from each matrix where energy < energy_cutoff.
  2. Stack these positions together and sort each pair so (i, j) and (j, i) collapse to one.
  3. Remove duplicate pairs.
  4. Map local indices back to global sequence indices via the indices list.
PARAMETER DESCRIPTION
offtarget_dict

Dictionary containing three N×N numpy arrays under keys:

  • 'handle_handle_energies'
  • 'antihandle_handle_energies'
  • 'antihandle_antihandle_energies'

TYPE: dict

indices

List of global sequence indices corresponding to matrix rows/columns.

TYPE: list of int

energy_cutoff

Threshold below which an energy defines an edge.

TYPE: float

RETURNS DESCRIPTION
list of tuple

List of (i, j) tuples where each is a global‐index edge with off‐target energy < cutoff.
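The four-step procedure can be sketched with NumPy as follows. `build_edges_sketch` is a hypothetical name; the real function may differ in details such as ordering or how it treats diagonal entries.

```python
import numpy as np


def build_edges_sketch(offtarget_dict, indices, energy_cutoff):
    positions = []
    for matrix in offtarget_dict.values():
        # Step 1: local (i, j) positions whose energy is below the cutoff.
        positions.append(np.argwhere(np.asarray(matrix) < energy_cutoff))
    if not positions:
        return []
    # Step 2: stack, then sort within each pair so (i, j) and (j, i) coincide.
    stacked = np.sort(np.vstack(positions), axis=1)
    # Step 3: remove duplicate pairs.
    unique_pairs = np.unique(stacked, axis=0)
    # Step 4: translate local matrix positions into global sequence indices.
    return [(indices[i], indices[j]) for i, j in unique_pairs]


off = {
    "handle_handle_energies": np.array(
        [[0.0, -9.0, -1.0], [-9.0, 0.0, -1.0], [-1.0, -1.0, 0.0]]
    ),
    "antihandle_handle_energies": np.array(
        [[-1.0, -1.0, -1.0], [-1.0, -1.0, -7.0], [-1.0, -1.0, -1.0]]
    ),
    "antihandle_antihandle_energies": np.zeros((3, 3)),
}
edges = build_edges_sketch(off, indices=[10, 20, 30], energy_cutoff=-5.0)
```

The symmetric entries (0, 1) and (1, 0) collapse into one edge, and local positions 0, 1, 2 are reported as global indices 10, 20, 30.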

compute_pair_conflict_probability

compute_pair_conflict_probability(offtarget_dict, energy_cutoff)

Computes pair-level conflict probability using the same conflict rule as build_edges.

A pair (i, j) with i != j is counted as conflicting if at least one of the three off-target interaction matrices violates energy_cutoff, exactly as in build_edges.

PARAMETER DESCRIPTION
offtarget_dict

Dictionary containing the three off-target energy matrices.

TYPE: dict

energy_cutoff

Threshold below which an interaction defines a conflict.

TYPE: float

RETURNS DESCRIPTION
float

Fraction of unordered pairs of sequence pairs that conflict, in [0, 1]. Returns 0.0 if fewer than 2 sequence pairs are present.
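A sketch of the counting rule, assuming the three matrices are square NumPy arrays of equal size. `pair_conflict_probability_sketch` is a hypothetical name, not the library's code.

```python
import numpy as np


def pair_conflict_probability_sketch(offtarget_dict, energy_cutoff):
    matrices = [np.asarray(m) for m in offtarget_dict.values()]
    n = matrices[0].shape[0] if matrices else 0
    if n < 2:
        return 0.0
    # A position conflicts if any matrix is below the cutoff there.
    conflict = np.zeros((n, n), dtype=bool)
    for m in matrices:
        conflict = conflict | (m < energy_cutoff)
    # Treat (i, j) and (j, i) as the same unordered pair.
    conflict = conflict | conflict.T
    upper = np.triu_indices(n, k=1)  # i < j only, which excludes i == j
    return float(conflict[upper].mean())


off = {
    "handle_handle_energies": np.array(
        [[0.0, -9.0, -1.0], [-9.0, 0.0, -1.0], [-1.0, -1.0, 0.0]]
    ),
    "antihandle_handle_energies": np.array(
        [[-1.0, -1.0, -1.0], [-1.0, -1.0, -7.0], [-1.0, -1.0, -1.0]]
    ),
    "antihandle_antihandle_energies": np.zeros((3, 3)),
}
probability = pair_conflict_probability_sketch(off, energy_cutoff=-5.0)
```

Of the three unordered pairs (0, 1), (0, 2), and (1, 2), two conflict, giving a probability of 2/3.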

select_vertices_to_remove

select_vertices_to_remove(vertex_cover, num_vertices_to_remove)

Selects a subset of vertices to remove from an existing vertex cover.

PARAMETER DESCRIPTION
vertex_cover

Current set of cover vertices.

TYPE: set

num_vertices_to_remove

Desired number of vertices to remove.

TYPE: int

RETURNS DESCRIPTION
set

Randomly chosen vertices to remove (size ≤ num_vertices_to_remove).

iterative_vertex_cover_multi

iterative_vertex_cover_multi(
    V,
    E,
    avoid_V=None,
    num_vertices_to_remove=150,
    max_iterations=200,
    limit=+inf,
    multistart=30,
    population_size=5,
    show_progress=False,
)

Attempts to find a small vertex cover via multiple randomized restarts and iterative refinement. Calls heuristic_vertex_cover_optimized2 both for the initial cover and for each re-covering step.

Algorithm Outline
  1. For each of multistart attempts:
    a. Compute an initial cover via the greedy heuristic.
    b. Initialize a population containing that cover.
    c. Repeat up to max_iterations:
      • For each cover in the population:
        • Remove num_vertices_to_remove random vertices (respecting avoid_V).
        • Find uncovered edges and re-cover via the heuristic.
        • If the new cover is smaller, reset the population to this cover.
        • If it’s the same size but unique, add it to the population.
      • Trim the population to population_size by random sampling.
      • Optionally print progress.
    d. If this attempt’s best cover is smaller than the global best, update it.
Notes

Because minimum vertex cover is NP-hard, this is a heuristic: it runs quickly but does not guarantee an optimal solution.

PARAMETER DESCRIPTION
V

All vertices in the graph (e.g., list or set of IDs). Note: V is only used for printing/monitoring; the graph is fully encoded by E.

TYPE: iterable

E

All edges (u, v) in global index space.

TYPE: iterable of tuple

avoid_V

Vertices to preferentially remove into the cover.

TYPE: (set, optional) DEFAULT: None

num_vertices_to_remove

Number of vertices to drop each iteration.

TYPE: int DEFAULT: 150

max_iterations

Max refine steps per restart.

TYPE: int DEFAULT: 200

limit

Target threshold for |V| - |cover|; stops early if reached.

TYPE: float DEFAULT: +inf

multistart

Number of independent greedy restarts.

TYPE: int DEFAULT: 30

population_size

Max number of equal-sized covers to retain each iteration.

TYPE: int DEFAULT: 5

show_progress

If True, prints status each iteration.

TYPE: bool DEFAULT: False

RETURNS DESCRIPTION
tuple[set, list[list[int]]]

Tuple of (best_vertex_cover, trajectories), where trajectories is a list of per-multistart lists of independent set sizes over iterations.

evolutionary_vertex_cover

evolutionary_vertex_cover(
    sequence_pairs,
    offtarget_limit,
    max_ontarget,
    min_ontarget,
    self_energy_limit,
    subsetsize=200,
    generations=100,
    stop_event=None,
)

Do not use. It performs worse than …

Evolves an independent set of sequences from a set of candidate sequence pairs by iteratively removing high-energy (off-target) interactions via vertex-cover heuristics.
Implements a form of genetic “survivor selection” via repeated vertex-cover: new sequences are sampled each generation and those with strong off-target interactions are “removed” again. The history variable ensures previously promising sequences re-enter the sampling pool.

Procedure
  1. Initialize:
    • non_cover_vertices: best independent set so far (sequences not in the cover).
    • history: indices to avoid reselection, preserving diversity.
  2. For each of generations iterations:
    a. Check if stop_event is set. If so, break.
    b. Select a random subset of sequences whose on-target energies lie within [min_ontarget, max_ontarget], excluding those in history.
    c. Re-add any sequences from history to ensure good candidates are retained.
    d. Assert that there are no duplicate indices.
    e. Compute off-target energies for the subset.
    f. Build the off-target interaction graph (edges where energy < offtarget_limit).
    g. Apply the multi-start, iterative vertex-cover heuristic to find removed_vertices.
    h. Derive the new independent set: all selected indices minus removed_vertices.
    i. If this independent set is at least as large as the previous best:
      • Update non_cover_vertices.
      • Clear history if strictly larger.
    j. If its size ≥ 95% of the best, add its indices (deduplicated) to history.
    k. Print generation summary statistics.
  3. On user interrupt (Ctrl+C) or stop_event, exit gracefully and proceed to save the current best.
  4. After all generations or interruption, save the final independent set to a text file.
Notes
  • Catches KeyboardInterrupt to allow early exit: the best result so far is saved and plotted.
PARAMETER DESCRIPTION
sequence_pairs

List of (index, (seq, rc_seq)) tuples for candidate sequences.

TYPE: list of tuple

offtarget_limit

Energy threshold below which an off-target interaction defines an edge.

TYPE: float

max_ontarget

Upper bound for acceptable on-target energy.

TYPE: float

min_ontarget

Lower bound for acceptable on-target energy.

TYPE: float

self_energy_limit

Minimum acceptable self-energy for each strand.

TYPE: float

subsetsize

Number of sequences to sample per generation.

TYPE: int DEFAULT: 200

generations

Number of evolutionary iterations to perform.

TYPE: int DEFAULT: 100

stop_event

Optional threading.Event to stop the search.

TYPE: Event DEFAULT: None

RETURNS DESCRIPTION
list of tuple

Final list of (seq, rc_seq) pairs forming the best independent set.

Helper Functions

orthoseq_generator.helper_functions

DelayedKeyboardInterrupt

Context manager that delays KeyboardInterrupt (Ctrl+C) during critical operations.

This prevents corruption of the precomputed energy library by deferring interrupt handling until the protected block (e.g., file writes) completes.

Usage

with DelayedKeyboardInterrupt():
    # perform critical operation, like saving files
    save_pickle_atomic(...)

Notes
  • On entering, replaces the SIGINT handler to queue the signal.
  • On exit, restores the original handler and re-raises if an interrupt was received.

set_nupack_params

set_nupack_params(material='dna', celsius=37, sodium=0.05, magnesium=0.025)

Updates global NUPACK parameters used for all energy computations.

Notes

These values are read by functions in sequence_computations when building a NUPACK Model. If you change parameters, you should also choose a new precompute library filename to avoid mixing incompatible energies.

PARAMETER DESCRIPTION
material

NUPACK material type (e.g., "dna").

TYPE: str DEFAULT: 'dna'

celsius

Temperature in Celsius.

TYPE: float DEFAULT: 37

sodium

Sodium concentration in M.

TYPE: float DEFAULT: 0.05

magnesium

Magnesium concentration in M.

TYPE: float DEFAULT: 0.025

RETURNS DESCRIPTION
None

None

choose_precompute_library

choose_precompute_library(filename)

Sets the name of the precomputed energy library file.

Notes

Updates the global variable used by other functions to locate the correct library.

PARAMETER DESCRIPTION
filename

Name of the pickle file where precomputed energies are or will be stored.

TYPE: str

RETURNS DESCRIPTION
None

None

save_pickle_atomic

save_pickle_atomic(data, filepath)

Saves a Python object to disk as a pickle file in a safe and atomic way.

Notes
  • Writes data to a temporary file (<filepath>.tmp) first, then atomically replaces the original file to avoid corruption if a crash occurs during writing.
  • Creates the target directory if it does not exist.
PARAMETER DESCRIPTION
data

Python object to save (typically a dictionary).

TYPE: any

filepath

Full path to the target pickle file.

TYPE: str

RETURNS DESCRIPTION
None

None
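The write-then-replace pattern from the notes can be sketched with the standard library. `save_pickle_atomic_sketch` is a hypothetical name; the `fsync` call is an addition for illustration and is not stated in the source.

```python
import os
import pickle
import tempfile


def save_pickle_atomic_sketch(data, filepath):
    directory = os.path.dirname(os.path.abspath(filepath))
    os.makedirs(directory, exist_ok=True)  # create the target directory if missing
    tmp_path = filepath + ".tmp"
    with open(tmp_path, "wb") as handle:
        pickle.dump(data, handle)
        handle.flush()
        os.fsync(handle.fileno())  # assumption: push bytes to disk before renaming
    # os.replace swaps the file in one step when both paths share a filesystem.
    os.replace(tmp_path, filepath)


with tempfile.TemporaryDirectory() as folder:
    target = os.path.join(folder, "nested", "energies.pkl")
    save_pickle_atomic_sketch({"ACGT": -5.2}, target)
    with open(target, "rb") as handle:
        restored = pickle.load(handle)
    leftover_tmp = os.path.exists(target + ".tmp")
```

Because the original file is only replaced after the temporary file is fully written, a crash during writing leaves the previous library intact.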

get_library_path

get_library_path()

Returns the full file path to the currently selected precomputed energy library.

Description

Constructs a path by combining the 'pre_computed_energies' folder with the filename set via choose_precompute_library(). If no filename has been set, defaults to 'test_lib.pkl'.

RETURNS DESCRIPTION
str

Full path to the pickle file containing the precomputed Gibbs free energy dictionary.

get_default_results_folder

get_default_results_folder()

Returns the default path to the 'noflank_results' folder where output files containing the generated sequence pairs are saved.

Description

The noflank_results directory is created automatically if it does not exist. The path is based on the current working directory from which the script was executed.

RETURNS DESCRIPTION
str

Absolute path to the 'noflank_results' directory.

save_sequence_pairs_to_txt

save_sequence_pairs_to_txt(sequence_pairs, filename=None)

Saves a list of DNA sequence pairs to a plain text file in the default noflank_results folder.

Description

Each line in the output file contains a sequence and its reverse complement, separated by a tab. If filename is not provided, an informative name is generated based on the number of sequences, sequence length, and current timestamp.

PARAMETER DESCRIPTION
sequence_pairs

List of (sequence, reverse_complement) tuples.

TYPE: list of tuple

filename

Optional custom file name. If None, a name is generated based on timestamp and sequence length.

TYPE: str | None DEFAULT: None

RETURNS DESCRIPTION
None

None

load_sequence_pairs_from_txt

load_sequence_pairs_from_txt(filename, use_default_results_folder=True)

Loads DNA sequence pairs from a plain text file in the default noflank_results folder.

Description

Reads a tab-separated text file where each line contains a sequence and its reverse complement. The file is located in the noflank_results directory returned by get_default_results_folder().

PARAMETER DESCRIPTION
filename

Name of the text file to load.

TYPE: str

use_default_results_folder

If True, interpret filename relative to the default noflank_results folder; otherwise treat it as an absolute or relative path.

TYPE: bool DEFAULT: True

RETURNS DESCRIPTION
list of tuple

List of (sequence, reverse_complement) tuples loaded from the file.

RAISES DESCRIPTION
FileNotFoundError

If the specified file does not exist.
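The tab-separated file format used by save_sequence_pairs_to_txt and load_sequence_pairs_from_txt can be illustrated with a round trip. The helper names below are hypothetical, and the sketch takes an explicit path instead of resolving the default noflank_results folder.

```python
import os
import tempfile


def write_pairs_sketch(sequence_pairs, path):
    # One pair per line: sequence, a tab, then its reverse complement.
    with open(path, "w") as handle:
        for seq, rc_seq in sequence_pairs:
            handle.write(f"{seq}\t{rc_seq}\n")


def read_pairs_sketch(path):
    if not os.path.exists(path):
        raise FileNotFoundError(path)
    pairs = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line:  # ignore blank lines
                seq, rc_seq = line.split("\t")
                pairs.append((seq, rc_seq))
    return pairs


original = [("ACGTTGA", "TCAACGT"), ("GGATCCA", "TGGATCC")]
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "pairs.txt")
    write_pairs_sketch(original, path)
    loaded = read_pairs_sketch(path)
```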

eqcorr2d Module

The C API is documented separately; the Python modules are documented below.

eqcorr2d.eqcorr2d_interface

High-level Python interface around the eqcorr2d C engine.

This module provides a thin but well-documented wrapper that:
  • Accepts dictionaries of binary handle/antihandle occupancy arrays (1D or 2D) keyed by user-facing identifiers.
  • Orchestrates optional geometric rotations in Python (0/90/180/270 for square lattices; 0/60/120/180/240/300 for triangular lattices) by pre-rotating the antihandle arrays before delegating to the C engine. The C core always computes for a fixed orientation; we rotate inputs instead of changing the core.
  • Aggregates per-rotation outputs into a single, stable result dictionary that is easier to consume than the legacy tuple.

Key terms:
  • handle_dict: dict[key -> np.ndarray] of uint8 with shape (H, W) or (L,) for 1D slats. Non-zero entries indicate occupied positions.
  • antihandle_dict: same as handle_dict, but for the opposing set.
  • "matchtype": an integer bin used by the C engine to bucket similarity counts. Larger values typically represent worse similarity.

Modes:
  • classic: only 0° and 180° rotations (historical behavior for 1D slats).
  • square_grid: 0°, 90°, 180°, 270°.
  • triangle_grid: 0°, 60°, 120°, 180°, 240°, 300° (implemented via rotate_array_tri60).

Smart mode (do_smart):
  • If enabled, we still compute 0°/180°.
  • For square_grid, 90°/270° are only computed when at least one side of a pair is truly 2D (H >= 2 and W >= 2). This keeps compute costs lower for pure 1D data.
  • For triangle_grid, the same idea applies to the six-fold rotation set.

Note: This module adds extensive comments and docstrings only. The C code is not modified by this interface.

comprehensive_score_analysis

comprehensive_score_analysis(
    handle_dict,
    antihandle_dict,
    match_counts,
    connection_graph,
    connection_angle,
    do_worst=False,
    fudge_dg=10,
    request_similarity_score=True,
)

Compute all relevant match count metrics for a megastructure's slat handles.

When provided with a megastructure's slat handles/antihandles and connection graphs, this function computes all relevant match count metrics including worst match, mean log score, similarity score, and a compensated match histogram.

PARAMETER DESCRIPTION
handle_dict

Dictionary mapping slat identifiers to handle arrays (can be 1D or 2D uint8 arrays).

TYPE: dict[Any, ndarray]

antihandle_dict

Dictionary mapping slat identifiers to antihandle arrays (can be 1D or 2D uint8 arrays).

TYPE: dict[Any, ndarray]

match_counts

Dictionary with counts of expected matches due to connections between slats. Keys are match counts (int), values are the number of such matches expected.

TYPE: dict[int, int]

connection_graph

Dictionary mapping match types to lists of (handle_key, antihandle_key) pairs that are expected to match due to connections.

TYPE: dict[int, list[tuple]]

connection_angle

Either '60' or '90' indicating the connection geometry (triangular or square grid).

TYPE: str

do_worst

If True, return which slat pairs contributed to the worst match score. Defaults to False.

TYPE: bool DEFAULT: False

fudge_dg

Fudge factor for mean log score computation. Defaults to 10.

TYPE: float DEFAULT: 10

request_similarity_score

If True, computes a similarity score to check for slat duplication risk. Defaults to True.

TYPE: bool DEFAULT: True

RETURNS DESCRIPTION
dict

Dictionary containing:

  • worst_match_score (int): The highest non-zero match count in the compensated histogram.
  • mean_log_score (float): Logarithmic weighted score normalized by number of pairs.
  • match_histogram (numpy.ndarray): Compensated histogram of match counts.
  • uncompensated_match_histogram (numpy.ndarray): Raw histogram before connection compensation.
  • similarity_score (int, optional): Highest match count within handle/antihandle sets (if requested).
  • worst_slat_combos (list, optional): List of (handle_key, antihandle_key, count) tuples (if do_worst=True).

compensate_histogram

compensate_histogram(hist, connection_hist)

Subtract the connection occupancy from the total histogram safely.

Handles unequal lengths by padding the shorter array with zeros.

PARAMETER DESCRIPTION
hist

The total match histogram to compensate.

TYPE: ndarray

connection_hist

Histogram of expected matches due to slat connections.

TYPE: ndarray

RETURNS DESCRIPTION
numpy.ndarray

Compensated histogram with connection matches subtracted.

RAISES DESCRIPTION
ValueError

If subtraction results in negative values (over-subtraction).
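The padding and over-subtraction check can be sketched as follows. `compensate_histogram_sketch` is a hypothetical name, and the error message is illustrative only.

```python
import numpy as np


def compensate_histogram_sketch(hist, connection_hist):
    hist = np.asarray(hist, dtype=np.int64)
    connection_hist = np.asarray(connection_hist, dtype=np.int64)
    length = max(hist.size, connection_hist.size)
    # Zero-pad the shorter histogram so both cover the same match types.
    padded_hist = np.pad(hist, (0, length - hist.size), mode="constant")
    padded_conn = np.pad(connection_hist, (0, length - connection_hist.size), mode="constant")
    result = padded_hist - padded_conn
    if np.any(result < 0):
        raise ValueError("connection histogram exceeds total histogram")
    return result


compensated = compensate_histogram_sketch([10, 4, 2, 1], [0, 1, 1])
```

Here the connection histogram is one bin shorter than the total histogram, so it is padded with a trailing zero before the subtraction.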

make_connection_histogram

make_connection_histogram(connection_graph)

Create a histogram from a connection graph.

PARAMETER DESCRIPTION
connection_graph

Dictionary mapping match types (int) to lists of connection pairs.

TYPE: dict[int, list]

RETURNS DESCRIPTION
numpy.ndarray

Histogram where each index corresponds to a match type and the value is the number of connections for that type.
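A minimal sketch of this mapping from connection graph to histogram. `make_connection_histogram_sketch` is a hypothetical name, and the behavior for an empty graph is an assumption.

```python
import numpy as np


def make_connection_histogram_sketch(connection_graph):
    if not connection_graph:
        return np.zeros(0, dtype=np.int64)  # assumption: empty graph gives an empty histogram
    # Index = match type, value = number of connection pairs of that type.
    hist = np.zeros(max(connection_graph) + 1, dtype=np.int64)
    for match_type, pairs in connection_graph.items():
        hist[match_type] = len(pairs)
    return hist


graph = {1: [("h1", "a1"), ("h2", "a2")], 3: [("h1", "a3")]}
connection_hist = make_connection_histogram_sketch(graph)
```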

wrap_eqcorr2d

wrap_eqcorr2d(
    handle_dict,
    antihandle_dict,
    mode="classic",
    hist=True,
    local_histogram=False,
    report_full=False,
    do_smart=False,
)

Run eqcorr2d on all handle/antihandle pairs, optionally across rotations.

This function is the preferred high-level entry point. It accepts two dictionaries mapping arbitrary keys (e.g., slat ids) to binary occupancy arrays, prepares them for the low-level C engine, optionally pre-rotates the antihandles for the requested angle set, and then aggregates all outputs into a single, well-structured result dictionary.

PARAMETER DESCRIPTION
handle_dict

Dictionary mapping keys to binary arrays (uint8), either 1D with shape (L,) or 2D with shape (H, W). Non-zeros mark occupied positions. Each array is converted to C-contiguous uint8 and reshaped to (1, L) for 1D inputs.

TYPE: dict[Any, ndarray]

antihandle_dict

Same rules as handle_dict, but for the opposing handle set.

TYPE: dict[Any, ndarray]

mode

Rotation mode determining which angles are computed:

  • 'classic': [0, 180] degrees only
  • 'square_grid': [0, 90, 180, 270] degrees
  • 'triangle_grid': [0, 60, 120, 180, 240, 300] degrees

Defaults to 'classic'.

TYPE: str DEFAULT: 'classic'

hist

If True, request histogram accumulation from the C engine. The top-level 'hist_total' returned is the sum across all considered rotations. Defaults to True.

TYPE: bool DEFAULT: True

local_histogram

If True, compute per-pair histograms in addition to the global histogram. Results stored in 'local_hist_total'. Defaults to False.

TYPE: bool DEFAULT: False

report_full

If True, per-rotation raw outputs are included under result['rotations'][angle]['full']. Defaults to False.

TYPE: bool DEFAULT: False

do_smart

Heuristic compute saver. For square/triangle grids, 90°/270° (and the non-axial 60° steps) are only evaluated for pairs where at least one operand is truly 2D (H >= 2 and W >= 2). 0°/180° are always evaluated. Defaults to False.

TYPE: bool DEFAULT: False

RETURNS DESCRIPTION
dict

Note: The low-level C engine always computes a single orientation. Rotations are handled by pre-rotating the antihandle arrays before calling the C engine. For triangle_grid, rotations are performed via rotate_array_tri60.

Dictionary containing:

  • angles (list[int]): Angles actually computed.
  • hist_total (numpy.ndarray or None): Summed histogram if hist=True.
  • rotations (dict): Per-rotation data (if report_full=True).
  • handle_keys (list): Keys from handle_dict in order.
  • anti_handle_keys (list): Keys from antihandle_dict in order.
  • local_hist_total (numpy.ndarray or None): 3D array (nA, nB, L) if local_histogram=True.
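How mode and do_smart together determine the angle set for a single pair can be sketched as below. The names `ANGLES_BY_MODE`, `is_truly_2d`, and `angles_for_pair_sketch` are hypothetical; the sketch only mirrors the rules stated in the parameter descriptions and does not call the C engine.

```python
import numpy as np

ANGLES_BY_MODE = {
    "classic": [0, 180],
    "square_grid": [0, 90, 180, 270],
    "triangle_grid": [0, 60, 120, 180, 240, 300],
}


def is_truly_2d(arr):
    # 1D inputs of shape (L,) are viewed as (1, L), so they are never "truly 2D".
    arr = np.atleast_2d(arr)
    return arr.shape[0] >= 2 and arr.shape[1] >= 2


def angles_for_pair_sketch(handle, antihandle, mode="classic", do_smart=False):
    angles = ANGLES_BY_MODE[mode]
    if do_smart and not (is_truly_2d(handle) or is_truly_2d(antihandle)):
        # Pure 1D pair: only 0 and 180 degrees are evaluated.
        return [a for a in angles if a in (0, 180)]
    return list(angles)


slat_1d = np.ones(32, dtype=np.uint8)
slat_2d = np.ones((2, 16), dtype=np.uint8)
angles_1d = angles_for_pair_sketch(slat_1d, slat_1d, mode="square_grid", do_smart=True)
angles_2d = angles_for_pair_sketch(slat_1d, slat_2d, mode="square_grid", do_smart=True)
```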

get_worst_match

get_worst_match(c_results)

Return the worst (highest non-zero) matchtype from a result.

PARAMETER DESCRIPTION
c_results

Result dictionary from wrap_eqcorr2d containing 'hist_total'.

TYPE: dict

RETURNS DESCRIPTION
int | None

The highest non-zero bin index in the histogram, or None if histogram is empty.
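A sketch of the lookup, assuming 'hist_total' is a 1D count array indexed by matchtype. `worst_match_sketch` is a hypothetical name.

```python
import numpy as np


def worst_match_sketch(c_results):
    hist = c_results.get("hist_total")
    if hist is None:
        return None
    nonzero_bins = np.flatnonzero(np.asarray(hist))
    # The last non-zero bin is the highest matchtype that actually occurred.
    return int(nonzero_bins[-1]) if nonzero_bins.size else None


worst = worst_match_sketch({"hist_total": np.array([120, 30, 4, 0, 0])})
```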

get_sum_score

get_sum_score(c_results, fudge_dg=10)

Compute an exponentially weighted sum score from histogram.

The score is computed as: sum(count * exp(fudge_dg * matchtype)) for all bins.

PARAMETER DESCRIPTION
c_results

Result dictionary from wrap_eqcorr2d containing 'hist_total'.

TYPE: dict

fudge_dg

Exponential weighting factor. Higher values penalize high match counts more. Defaults to 10.

TYPE: float DEFAULT: 10

RETURNS DESCRIPTION
float

Weighted sum score.
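The formula sum(count * exp(fudge_dg * matchtype)) can be written out as a short sketch, assuming the bin index is the matchtype. `sum_score_sketch` is a hypothetical name.

```python
import numpy as np


def sum_score_sketch(c_results, fudge_dg=10):
    hist = np.asarray(c_results["hist_total"], dtype=float)
    matchtypes = np.arange(hist.size)  # bin index doubles as the matchtype
    return float(np.sum(hist * np.exp(fudge_dg * matchtypes)))


score = sum_score_sketch({"hist_total": [5, 2, 1]}, fudge_dg=1.0)
```

With fudge_dg=1.0 this evaluates to 5 + 2e + e², which shows how higher match counts dominate the score even when they are rare.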

get_seperate_worst_lists

get_seperate_worst_lists(c_results)

Return separate lists of worst handle and antihandle identifiers.

Extracts the handle and antihandle keys that contributed to the worst (highest) match count bin.

PARAMETER DESCRIPTION
c_results

Result dictionary from wrap_eqcorr2d containing 'hist_total', 'handle_keys', 'anti_handle_keys', and 'local_hist_total'.

TYPE: dict

RETURNS DESCRIPTION
tuple[list, list] | tuple[None, None]

Tuple of (handle_list, antihandle_list) where each list contains the keys that contributed to the worst match bin. Returns (None, None) if required data is missing.

get_worst_keys_combos

get_worst_keys_combos(c_results)

Return a list of key pairs that contributed to the global worst histogram bin.

Relies on the 'local_hist_total' 3D array of shape (nA, nB, L) and the key lists.

PARAMETER DESCRIPTION
c_results

Result dictionary from wrap_eqcorr2d containing 'hist_total', 'handle_keys', 'anti_handle_keys', and 'local_hist_total'.

TYPE: dict

RETURNS DESCRIPTION
list[tuple] | None

List of (handle_key, antihandle_key, count) tuples for pairs that contributed to the worst match bin. Returns None if required data is missing.

get_compensated_worst_keys_combos

get_compensated_worst_keys_combos(c_results, connection_graph)

Return worst key pairs after compensating for expected connections.

Similar to get_worst_keys_combos, but subtracts expected matches from the connection graph to identify truly problematic pairs.

PARAMETER DESCRIPTION
c_results

Result dictionary from wrap_eqcorr2d containing 'hist_total', 'handle_keys', 'anti_handle_keys', and 'local_hist_total'.

TYPE: dict

connection_graph

Dictionary mapping match types to lists of expected (handle_key, antihandle_key) connection pairs.

TYPE: dict[int, list[tuple]]

RETURNS DESCRIPTION
list[tuple]

List of (handle_key, antihandle_key, adjusted_count) tuples for pairs that contributed to the worst match bin after compensation.

RAISES DESCRIPTION
ValueError

If over-subtraction is detected (more skips than actual matches).

RuntimeError

If expected skips don't match found pairs.

get_similarity_hist

get_similarity_hist(handle_dict, antihandle_dict, mode='square_grid', do_smart=True)

Build a library-level similarity histogram (handles+antihandles).

This helper runs wrap_eqcorr2d twice, once within the handle set and once within the antihandle set, then sums the resulting histograms. Finally it subtracts a simple self-match correction so that exact self-pairs do not inflate the counts.

PARAMETER DESCRIPTION
handle_dict

Dictionary mapping keys to handle arrays.

TYPE: dict[Any, ndarray]

antihandle_dict

Dictionary mapping keys to antihandle arrays.

TYPE: dict[Any, ndarray]

mode

Rotation mode for comparisons. One of 'classic', 'square_grid', or 'triangle_grid'. Defaults to 'square_grid'.

TYPE: str DEFAULT: 'square_grid'

do_smart

If True, skip unnecessary rotation computations for 1D arrays. Defaults to True.

TYPE: bool DEFAULT: True

RETURNS DESCRIPTION
dict

Note: The self-match correction subtracts one count at matchtype = number of nonzeros for each individual array.

Dictionary containing:

  • hist_total (numpy.ndarray): Combined similarity histogram with self-match correction.
  • angles (list): Empty list (no per-rotation info for this helper).
  • rotations (dict): Empty dict.
  • worst_keys_combos (None): Not computed for similarity analysis.

eqcorr2d.slat_standardized_mapping

convert_to_triangular

convert_to_triangular(coord)

Convert Cartesian coordinates to triangular lattice coordinates.

Applies the transformation: x(new) = (x+y)/2, y(new) = -x

PARAMETER DESCRIPTION
coord

Tuple of (y, x) Cartesian coordinates.

TYPE: tuple[int, int]

RETURNS DESCRIPTION
tuple[int, int]

Tuple of (y_new, x_new) triangular coordinates.

convert_triangular_coords_to_array

convert_triangular_coords_to_array(coords)

Convert a list of triangular coordinates into a numpy array.

Creates a 2D array where each coordinate position contains its 1-based index. Coordinates are shifted so the minimum values become 0.

PARAMETER DESCRIPTION
coords

List of (x, y) coordinate tuples in triangular space.

TYPE: list[tuple[int, int]]

RETURNS DESCRIPTION
numpy.ndarray

2D array with coordinate positions marked by their 1-based indices.

generate_standardized_slat_handle_array

generate_standardized_slat_handle_array(slat_1D_array, slat_type)

Map slat handles to a standardized 2D shape for match calculations.

Given a list of slat handles in order, assigns handles to their corresponding standardized slat shape, which can then be used downstream in handle match valency calculations.

PARAMETER DESCRIPTION
slat_1D_array

1D numpy array containing slat handle values in order.

TYPE: ndarray

slat_type

Slat type identifier (e.g., 'DB-L-120', 'DB-L-60', 'DB-R-60', 'DB-R-120'). Must be a key in standardized_slat_mappings.

TYPE: str

RETURNS DESCRIPTION
numpy.ndarray

2D numpy array containing slat handles arranged in standardized shape.

eqcorr2d.rot60

rotate_coords_tri60

rotate_coords_tri60(i: ndarray, j: ndarray, k: int = 1)

Rotate lattice coordinates by k×60° on a triangular (axial) grid.

This implements the closed-form rotations for the axial coordinate system commonly used with triangular/hexagonal grids. It is fully vectorized and supports broadcasting; i and j can be scalars or arrays of the same shape.

The six distinct rotations (k mod 6) are:

  • R: (i, j) -> (-j, i + j)
  • R^2: (i, j) -> (-(i+j), i)
  • R^3: (i, j) -> (-i, -j)
  • R^4: (i, j) -> (j, -i - j)
  • R^5: (i, j) -> (i + j, -i)
  • R^0: identity (i, j)
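
The six rotations listed above can be checked with a small standalone sketch. It is not the library's implementation. The names ROTATIONS and rotate_axial_sketch are hypothetical and simply encode the table as written.

```python
import numpy as np

# One entry per k mod 6, copied from the rotation table above.
ROTATIONS = (
    lambda i, j: (i, j),          # R^0
    lambda i, j: (-j, i + j),     # R
    lambda i, j: (-(i + j), i),   # R^2
    lambda i, j: (-i, -j),        # R^3
    lambda i, j: (j, -i - j),     # R^4
    lambda i, j: (i + j, -i),     # R^5
)

def rotate_axial_sketch(i, j, k=1):
    """Rotate axial coordinates by k x 60 degrees (k reduced mod 6)."""
    return ROTATIONS[k % 6](np.asarray(i), np.asarray(j))

i = np.array([1, 0, 2])
j = np.array([0, 1, -1])

ri, rj = rotate_axial_sketch(i, j, 1)
print(ri, rj)  # [ 0 -1  1] [1 1 1]

# Applying the single step R six times returns the original coordinates.
ci, cj = i, j
for _ in range(6):
    ci, cj = rotate_axial_sketch(ci, cj, 1)
print(np.array_equal(ci, i) and np.array_equal(cj, j))  # True
```

Because Python's % operator returns a non-negative result, negative k values also resolve to a valid table entry (for example, k = -1 selects R^5).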

PARAMETER DESCRIPTION
i

Row indices in the axial coordinate system. Can be scalar or array.

TYPE: ndarray

j

Column indices in the axial coordinate system. Must be broadcastable with i.

TYPE: ndarray

k

Number of 60° rotation steps. Can be any integer; reduced mod 6 internally. Defaults to 1.

TYPE: int DEFAULT: 1

RETURNS DESCRIPTION
tuple[numpy.ndarray, numpy.ndarray]

Tuple (x, y) of rotated coordinates with the same broadcasted shape as inputs.

rotate_array_tri60

rotate_array_tri60(
    arr: ndarray, k: int = 1, map_only_nonzero: bool = False, return_shift: bool = False
)

Rotate a 2D occupancy array by k×60° on a triangular lattice.

The array is assumed to live on axial coordinates (i=row, j=col). Rotation is implemented by transforming indices, shifting them to be non-negative, and scattering values into a tightly sized output array.

PARAMETER DESCRIPTION
arr

2D input array (e.g., uint8 occupancy). The dtype and sparsity are preserved in the output; shape generally changes with rotation.

TYPE: ndarray

k

Number of 60° rotation steps. Any integer; reduced mod 6 internally. Defaults to 1.

TYPE: int DEFAULT: 1

map_only_nonzero

If True, only non-zero entries are transformed and written, which is typically faster for sparse integer masks. Defaults to False.

TYPE: bool DEFAULT: False

return_shift

If True, returns a tuple (rotated, shift_x, shift_y) where shift_* are the offsets added to make all indices >= 0. Defaults to False.

TYPE: bool DEFAULT: False

RETURNS DESCRIPTION
numpy.ndarray | tuple[numpy.ndarray, int, int]

Note: Because rotation is done in index space, the output shape depends on k and on the input's footprint on the lattice. Expect a different bounding box for each rotation.

Rotated array, optionally with integer shifts if return_shift=True.
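
The transform, shift and scatter steps described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the library code. The name rotate_array_sketch is hypothetical, every cell is mapped (there is no map_only_nonzero path), and the shifts are always returned.

```python
import numpy as np

def rotate_array_sketch(arr, k=1):
    """Rotate a 2D array by k x 60 degrees on axial (i=row, j=col) indices."""
    arr = np.asarray(arr)
    i, j = np.indices(arr.shape)
    for _ in range(k % 6):
        i, j = -j, i + j                   # single 60-degree step R
    shift_i, shift_j = -i.min(), -j.min()  # offsets that make indices >= 0
    i, j = i + shift_i, j + shift_j
    out = np.zeros((i.max() + 1, j.max() + 1), dtype=arr.dtype)
    out[i, j] = arr                        # scatter into the tight output
    return out, int(shift_i), int(shift_j)

arr = np.array([[1, 2],
                [0, 3]], dtype=np.uint8)

rotated, shift_i, shift_j = rotate_array_sketch(arr, k=1)
print(rotated)
# [[0 2 3]
#  [1 0 0]]
print(shift_i, shift_j)  # 1 0
```

The 2×2 input becomes a 2×3 output, which illustrates why the bounding box generally changes with rotation. The dtype is carried over from the input.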