anuga-cuda CUDA Code Structure

Advanced Version Class

class anuga_cuda.gpu_domain_advanced.CUDA_advanced_domain(coordinates=None, vertices=None, boundary=None, source=None, triangles=None, conserved_quantities=None, evolved_quantities=None, other_quantities=None, tagged_elements=None, geo_reference=None, use_inscribed_circle=False, mesh_filename=None, use_cache=False, verbose=False, full_send_dict=None, ghost_recv_dict=None, starttime=0.0, processor=0, numproc=1, number_of_full_nodes=None, number_of_full_triangles=None, ghost_layer_width=2, using_gpu=False, cotesting=False, stream=False, rearranged=False, domain=None)

The advanced version of the CUDA-based ANUGA domain.

allocate_device_array()

Allocate device memory.

apply_fractional_steps()

Overridden for testing purposes only.

apply_protection_against_isolated_degenerate_timesteps()

Overridden because the max_speed array must first be downloaded from device memory.

asynchronous_transfer()

Upload mesh information from host to device.

When all mesh information is stored in page-locked host memory, asynchronous transfers can hide transfer overhead by issuing several transfers together and overlapping them with kernel execution.

backup_conserved_quantities()

Overridden to use device memory to temporarily back up the centroid_values of the quantity instances.

balance_deep_and_shallow()

Overridden to invoke the balance_deep_and_shallow kernel function.

compute_fluxes()

Overridden to invoke the compute_fluxes and gravity series of kernel functions and to download the computed timestep information.

compute_forcing_terms()

Overridden to invoke the kernel versions of the forcing term functions.

copy_back_necessary_data()

Download results from device.

distribute_to_vertices_and_edges()

Overridden to invoke the protect series of kernel functions.

ensure_numeric(A, typecode=None)

From numerical_tools

equip_kernel_functions()

Compile and equip the kernel code.

Equip all kernel functions and obtain the appropriate thread block configuration for each. Also set up the prepared calls by specifying the parameter types of every kernel function.
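
A thread block configuration comes down to choosing a block size and deriving how many blocks are needed to cover every element. The following is a minimal sketch in plain Python, assuming a one-dimensional launch; the helper name launch_config and the block size of 256 are illustrative assumptions and are not taken from anuga-cuda.

```python
def launch_config(number_of_elements, block_size=256):
    """Return (grid, block) tuples for a one-dimensional kernel launch.

    Hypothetical helper for illustration only; not part of anuga-cuda.
    """
    if number_of_elements < 0:
        raise ValueError("number_of_elements must be non-negative")
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    # Ceiling division: enough blocks so that every element gets a thread.
    number_of_blocks = (number_of_elements + block_size - 1) // block_size
    # Use at least one block so that an empty mesh still yields a valid grid.
    grid = (max(number_of_blocks, 1), 1)
    block = (block_size, 1, 1)
    return grid, block


grid, block = launch_config(1000, block_size=256)
print(grid, block)  # (4, 1) (256, 1, 1)
```

Because the grid can contain more threads than elements (1024 threads for 1000 elements here), a kernel launched this way has to skip indices beyond the last element.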

evolve(yieldstep=None, finaltime=None, duration=None, skip_initial_step=False)
evolve_one_euler_step(yieldstep, finaltime)
evolve_one_rk2_step(yieldstep, finaltime)
evolve_one_rk3_step(yieldstep, finaltime)
extrapolate_second_order_sw()

Overridden to invoke the extrapolate_velocity_second_order, extrapolate_second_order_sw_true and extrapolate_second_order_sw_false kernel functions.

get_absolute(points)

From geo_reference get_absolute

get_vertex_coordinates(triangle_id=None, absolute=False)
lock_host_page()

Use page-locked memory.

Register pageable host memory so that its pages are locked. This should be done whenever asynchronous transfer is used.

manning_friction_explicit()

Overridden to invoke the manning_friction_sloped and manning_friction_flat kernel functions.

manning_friction_implicit()

Overridden to invoke the manning_friction_sloped and manning_friction_flat kernel functions.

protect_against_infinitesimal_and_negative_heights()

Overridden to invoke the protect series of kernel functions.

saxpy_conserved_quantities(a, b)

Overridden to invoke the saxpy_centroid_values kernel function.
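
As a reference for what such a kernel computes, here is a host-side sketch in plain Python. It assumes the usual ANUGA semantics, in which the centroid values of each conserved quantity are replaced by a times the current values plus b times the backed-up values. The function name saxpy_centroid_values_reference and the list-based data are illustrative and are not the anuga-cuda implementation.

```python
def saxpy_centroid_values_reference(a, b, centroid_values, centroid_backup_values):
    """Update centroid_values in place: q <- a*q + b*q_backup.

    Host-side reference sketch; the semantics are an assumption, see above.
    """
    if len(centroid_values) != len(centroid_backup_values):
        raise ValueError("arrays must have the same length")
    for k in range(len(centroid_values)):
        centroid_values[k] = a * centroid_values[k] + b * centroid_backup_values[k]
    return centroid_values


stage = [2.0, 4.0, 6.0]          # values after the time steps
stage_backup = [0.0, 2.0, 10.0]  # values saved before the time steps
saxpy_centroid_values_reference(0.5, 0.5, stage, stage_backup)
print(stage)  # [1.0, 3.0, 8.0]
```

With a = b = 0.5 the call averages the stepped values with the backup, which is the kind of combination a two-stage Runge-Kutta scheme typically needs.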

store_timestep()
update_boundary()
update_centroids_of_velocities_and_height()

Overridden to invoke the set_boundary_values_from_edges and update_centroids_of_velocities_and_height kernel functions.

update_conserved_quantities()

Overridden to invoke the update kernel function and, for each quantity, to reset semi_implicit_update in device memory to 0.
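
The update-then-reset pattern can be sketched on the host in plain Python. The update rule below (an explicit increment followed by a semi-implicit division) is an assumption modelled on how ANUGA quantities are commonly updated and is not confirmed by this page; only the final reset of semi_implicit_update to 0 is stated above. The name update_reference is hypothetical.

```python
def update_reference(timestep, centroid_values, explicit_update, semi_implicit_update):
    """Apply one update in place, then zero semi_implicit_update.

    Assumed rule (not confirmed by the anuga-cuda docs):
        q <- (q + dt * explicit) / (1 - dt * semi_implicit / q_old)
    """
    for k in range(len(centroid_values)):
        # Normalise the semi-implicit term by the current value; use 0 for q == 0.
        if centroid_values[k] == 0.0:
            rate = 0.0
        else:
            rate = semi_implicit_update[k] / centroid_values[k]
        # Explicit part of the update.
        centroid_values[k] += timestep * explicit_update[k]
        # Semi-implicit part of the update.
        denominator = 1.0 - timestep * rate
        if denominator <= 0.0:
            raise ValueError("non-positive denominator in semi-implicit update")
        centroid_values[k] /= denominator
        # Reset so that the next step starts from a clean accumulator.
        semi_implicit_update[k] = 0.0
    return centroid_values


values = [1.0, 2.0]
explicit = [10.0, 0.0]
semi_implicit = [0.0, -4.0]
update_reference(0.5, values, explicit, semi_implicit)
print(values)         # [6.0, 1.0]
print(semi_implicit)  # [0.0, 0.0]
```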

update_extrema()
update_ghosts()

Overridden for testing purposes only.

update_other_quantities()
update_timestep(yieldstep, finaltime)

Overridden for testing purposes only.

using_stream = None

A Boolean that indicates whether to use streams (concurrent kernel execution). The default value is False. If the device does not support concurrent kernel execution, the value is also set to False.

Basic Version Class

class anuga_cuda.gpu_domain_basic.CUDA_basic_domain(coordinates=None, vertices=None, boundary=None, source=None, triangles=None, conserved_quantities=None, evolved_quantities=None, other_quantities=None, tagged_elements=None, geo_reference=None, use_inscribed_circle=False, mesh_filename=None, use_cache=False, verbose=False, full_send_dict=None, ghost_recv_dict=None, starttime=0.0, processor=0, numproc=1, number_of_full_nodes=None, number_of_full_triangles=None, ghost_layer_width=2, using_gpu=False, cotesting=False, stream=False, rearranged=False, domain=None)
