This draft for odc: auto compilation ( hidden switching mechanism). f…
…circle-qcircle step was implemented with compilation

Conv2D_000.circle is used as the test model (it was added only to verify this draft).

This draft contains auto compilation, which is executed in the `nnfw_run_with_auto_compilation` function.

- The `nnfw_run_with_auto_compilation` function was added.
- QuantizerManager, the quantizer, and MinMaxReader were changed to set the minmax records threshold and to check readiness for quantization. A function for removing the minmax file was added.
- An OdcInfo class for storing the current state of the ODC was added.
- The compilation step uses the "session::codegen" function.
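
The threshold and readiness check described in the list above can be pictured roughly as follows. This is only a sketch: the class name `OdcStateSketch` and all of its members are hypothetical and are not taken from the diff, which does not show the real OdcInfo or QuantizerManager code.

```cpp
#include <cstdint>

// Hypothetical stand-in for the kind of state an OdcInfo-like class might track.
// None of these names come from the commit; they only illustrate the idea.
class OdcStateSketch
{
public:
  // Number of MinMax records that must be collected before quantization.
  void setMinMaxRecordsThreshold(uint32_t threshold) { _threshold = threshold; }

  // Call once for every inference run that recorded MinMax statistics.
  void onMinMaxRecorded() { ++_recorded; }

  // Ready only when a non-zero threshold is set and has been reached.
  bool readyForQuantize() const { return _threshold > 0 && _recorded >= _threshold; }

  // Models the effect of removing the minmax file: collected records are gone.
  void resetRecords() { _recorded = 0; }

private:
  uint32_t _threshold = 0;
  uint32_t _recorded = 0;
};
```

Treating a zero threshold as "never ready" is an assumption of this sketch; the actual behavior for an unset threshold is not visible in the shown hunks.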

After model quantization, a step that removes the minmax file was added.
After the quantized model is loaded for the first time, the following sequence is performed:

- save the input and output buffers of the initial fcircle model
- attempt to compile the quantized model and load the compiled model; if that fails, try to load the quantized model
- restore the inputs and outputs

After that:

- run inference with the compiled or quantized model
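
The switching sequence above can be sketched as a small piece of control flow. Everything here (`IoBuffers`, `SessionOps`, `switchModel`, `LoadedModel`) is hypothetical and exists only to illustrate the order of steps and the fallback; the real logic lives in the session code, which is not shown on this page.

```cpp
#include <functional>
#include <vector>

// Hypothetical container for the user-provided input/output buffers.
struct IoBuffers
{
  std::vector<const void *> inputs;
  std::vector<void *> outputs;
};

enum class LoadedModel { None, Quantized, Compiled };

// Hypothetical hooks standing in for the session operations.
struct SessionOps
{
  std::function<IoBuffers()> saveBuffers;                // remember current I/O buffers
  std::function<bool()> compileAndLoad;                  // codegen, then load compiled model
  std::function<bool()> loadQuantized;                   // fallback: load quantized model
  std::function<void(const IoBuffers &)> restoreBuffers; // re-attach saved buffers
};

// Save buffers, prefer the compiled model, fall back to the quantized one,
// then restore the buffers so the next inference uses the same I/O memory.
LoadedModel switchModel(const SessionOps &ops)
{
  const IoBuffers saved = ops.saveBuffers();

  LoadedModel loaded = LoadedModel::None;
  if (ops.compileAndLoad())
    loaded = LoadedModel::Compiled;
  else if (ops.loadQuantized())
    loaded = LoadedModel::Quantized;

  // Buffers are restored only if some model was actually loaded.
  if (loaded != LoadedModel::None)
    ops.restoreBuffers(saved);

  return loaded;
}
```

Inference then runs on whichever model `switchModel` reports. What happens when neither model loads is an assumption here; this sketch simply returns `LoadedModel::None` and leaves the decision to the caller.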

Positive and negative tests were added.

ONE-DCO-1.0-Signed-off-by:  Evgenii Maltsev [email protected]
Torrero committed Oct 18, 2024
1 parent 8f37c80 commit fe7d8ce
Showing 20 changed files with 854 additions and 5 deletions.
Binary file added runtime/contrib/Odc/res/Conv2D_000.circle
Binary file not shown.
62 changes: 62 additions & 0 deletions runtime/onert/api/nnfw/include/nnfw_experimental.h
@@ -530,6 +530,67 @@ NNFW_STATUS nnfw_set_codegen_model_path(nnfw_session *session, const char *path)
*/
NNFW_STATUS nnfw_codegen(nnfw_session *session, const char *target, NNFW_CODEGEN_PREF pref);

/**
* @brief Set quantization type
*
* This function should be called before {@link nnfw_quantize} is invoked.
*
* @param[in] session nnfw_session to set quantization type
* @param[in] qtype @c NNFW_QUANTIZE_TYPE
* @return @c NNFW_STATUS_NO_ERROR if successful,
* @c NNFW_STATUS_UNEXPECTED_NULL if session is null,
* otherwise return @c NNFW_STATUS_ERROR
*/
NNFW_STATUS nnfw_set_quantization_type(nnfw_session *session, NNFW_QUANTIZE_TYPE qtype);


/**
* @brief Set MinMax records count in auto compilation mode with on-device compiler
*
* This function sets the MinMax records count for quantization in auto compilation mode.
* To enable automatic compilation mode, use {@link nnfw_run_with_auto_compilation}.
*
* @param[in] session nnfw_session
* @param[in] minmax_records_count minmax records count
* @return @c NNFW_STATUS_NO_ERROR if successful, otherwise return @c NNFW_STATUS_ERROR
*/
NNFW_STATUS nnfw_set_odc_param_minmax_records_count(nnfw_session *session,
int minmax_records_count);

/**
* @brief Delete MinMax file for auto compilation mode
*
* @param[in] session nnfw_session
* @return @c NNFW_STATUS_NO_ERROR if successful, otherwise return @c NNFW_STATUS_ERROR
*/
NNFW_STATUS nnfw_odc_delete_minmax_file(nnfw_session *session);

/**
* @brief Run inference with auto compilation
*
* <p>This function runs inference with automatic compilation and replaces
* the original model with a quantized or compiled model inside.
* On-device compiler (ODC) provides quantization and compilation functionality.
* Function should be called after the model is loaded by {@link nnfw_load_model_from_file},
* the session is prepared for inference by {@link nnfw_prepare}, and input and output buffers
* are set by {@link nnfw_set_input} and {@link nnfw_set_output}.
*
* Additionally, the following parameters should be set up:
* 1. Quantization type {@link nnfw_set_quantization_type}
* 2. Quantized model path {@link nnfw_set_quantized_model_path}
* 3. Minmax records threshold for quantization {@link nnfw_set_odc_param_minmax_records_count}
* 4. File with MinMax statistics can be removed by {@link nnfw_odc_delete_minmax_file}
* 5. Compiled model path {@link nnfw_set_codegen_model_path}
* </p>
*
* @param[in] session nnfw_session
* @param[in] target Target backend to generate code as in {@link nnfw_codegen}
* @param[in] pref @c NNFW_CODEGEN_PREF
* @return @c NNFW_STATUS_NO_ERROR if successful, otherwise return @c NNFW_STATUS_ERROR
*/
NNFW_STATUS nnfw_run_with_auto_compilation(nnfw_session *session, const char *target, NNFW_CODEGEN_PREF pref);

//////////////////////////////////////////////
// APIs for configuration
//////////////////////////////////////////////
@@ -589,6 +650,7 @@ typedef enum
* TODO: Use workspace
*/
NNFW_RUN_CONFIG_PROFILE,

} NNFW_RUN_CONFIG;

/**
19 changes: 19 additions & 0 deletions runtime/onert/api/nnfw/src/nnfw_api.cc
@@ -496,6 +496,25 @@ NNFW_STATUS nnfw_codegen(nnfw_session *session, const char *target, NNFW_CODEGEN
return session->codegen(target, pref);
}

NNFW_STATUS nnfw_set_odc_param_minmax_records_count(nnfw_session *session, int minmax_records_count)
{
NNFW_RETURN_ERROR_IF_NULL(session);
return session->set_odc_param_minmax_records_count(minmax_records_count);
}

NNFW_STATUS nnfw_odc_delete_minmax_file(nnfw_session *session)
{
NNFW_RETURN_ERROR_IF_NULL(session);
return session->delete_odc_minmax_file();
}

NNFW_STATUS nnfw_run_with_auto_compilation(nnfw_session *session, const char *target,
NNFW_CODEGEN_PREF pref)
{
NNFW_RETURN_ERROR_IF_NULL(session);
return session->run_with_auto_compilation(target, pref);
}

// Configuration

NNFW_STATUS nnfw_set_prepare_config(nnfw_session *session, const NNFW_PREPARE_CONFIG key,