Hemm utilities
base64_decode_image(image)
Decodes a base64-encoded image string produced by hemm.utils.base64_encode_image.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
image | str | Base64-encoded image string produced by hemm.utils.base64_encode_image. | required |
Returns:
Type | Description |
---|---|
Image | Image.Image: PIL Image object. |
Source code in hemm/utils.py
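As a rough illustration of the decoding step, the sketch below strips a data-URL style prefix and decodes the remaining payload using only the standard library. The `data:<mimetype>;base64,` prefix and the helper name `decode_base64_payload` are assumptions made for illustration, not confirmed details of hemm.utils.base64_decode_image. The documented function returns a PIL image, whereas this sketch stops at raw bytes.

```python
import base64


def decode_base64_payload(encoded: str) -> bytes:
    """Hypothetical helper: return the raw bytes behind a base64 image string.

    Assumes an optional "data:<mimetype>;base64," prefix. This is an
    assumption about the encoded format, not a documented guarantee.
    """
    # Keep only the part after the marker; a string without the marker
    # is treated as a bare base64 payload.
    payload = encoded.split(";base64,")[-1]
    return base64.b64decode(payload)


# Build a small example string by hand (placeholder bytes, not a real image).
example = "data:image/png;base64," + base64.b64encode(b"not-a-real-image").decode("utf-8")
raw_bytes = decode_base64_payload(example)
```

With Pillow installed and real image data, wrapping the decoded bytes as `Image.open(io.BytesIO(raw_bytes))` would yield a PIL image.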
base64_encode_image(image_path, mimetype=None)
Converts an image to a base64-encoded string so that it can be logged and rendered on the Weave dashboard.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
image_path | Union[str, Image] | Path to the image or PIL Image object. | required |
mimetype | Optional[str] | Mimetype of the image. Defaults to None. | None |
Returns:
Name | Type | Description |
---|---|---|
str | str | Base64 encoded image string. |
Source code in hemm/utils.py
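The encoding direction can be sketched in the same spirit. The example below is a minimal standard-library version that works on raw bytes rather than a PIL image. The helper name `encode_bytes_as_base64`, the `filename` parameter, and the data-URL output format are all assumptions for illustration; the actual behavior of hemm.utils.base64_encode_image may differ.

```python
import base64
import mimetypes
from typing import Optional


def encode_bytes_as_base64(data: bytes, filename: str, mimetype: Optional[str] = None) -> str:
    """Hypothetical helper: encode raw bytes as a data-URL style string."""
    if mimetype is None:
        # Fall back to guessing the mimetype from the file extension.
        guessed, _ = mimetypes.guess_type(filename)
        mimetype = guessed or "application/octet-stream"
    payload = base64.b64encode(data).decode("utf-8")
    # Assumed output format: "data:<mimetype>;base64,<payload>".
    return f"data:{mimetype};base64,{payload}"


# Placeholder bytes stand in for the contents of an image file.
encoded = encode_bytes_as_base64(b"\x89PNG", "example.png")
encoded_jpeg = encode_bytes_as_base64(b"\xff\xd8", "photo.bin", mimetype="image/jpeg")
```

Passing `mimetype` explicitly skips the extension-based guess, which is useful when the input has no meaningful file name.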
publish_dataset_to_weave(dataset_path, dataset_name=None, prompt_column=None, ground_truth_image_column=None, split=None, data_limit=None, get_weave_dataset_reference=True, dataset_transforms=None, column_transforms=None, dump_dir='./dump', *args, **kwargs)
Publishes a HuggingFace dataset dictionary as a Weave dataset.
Publish a subset of MSCOCO from HuggingFace as a Weave Dataset
import weave
from hemm.utils import publish_dataset_to_weave

if __name__ == "__main__":
    weave.init(project_name="t2i_eval")
    dataset_reference = publish_dataset_to_weave(
        dataset_path="HuggingFaceM4/COCO",
        prompt_column="sentences",
        ground_truth_image_column="image",
        split="validation",
        dataset_transforms=[
            lambda item: {**item, "sentences": item["sentences"]["raw"]}
        ],
        data_limit=5,
    )
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset_path | [type] | Path to the HuggingFace dataset. | required |
dataset_name | Optional[str] | Name of the Weave dataset. | None |
prompt_column | Optional[str] | Column name for prompt. | None |
ground_truth_image_column | Optional[str] | Column name for ground truth image. | None |
split | Optional[str] | Split to be used. | None |
data_limit | Optional[int] | Limit the number of data items. | None |
get_weave_dataset_reference | bool | Whether to return the Weave dataset reference. | True |
dataset_transforms | Optional[List[Callable]] | List of dataset transforms. | None |
column_transforms | Optional[Dict[str, Callable]] | Column specific transforms. | None |
dump_dir | Optional[str] | Directory to dump the results. | './dump' |
Returns:
Type | Description |
---|---|
Union[ObjectRef, None] | Union[ObjectRef, None]: Weave dataset reference if get_weave_dataset_reference is True. |
Source code in hemm/utils.py
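To make the roles of data_limit, dataset_transforms, and column_transforms more concrete, here is a plain-Python sketch that applies them to a list of dictionaries. It does not call HuggingFace or Weave. The function name `apply_transforms_sketch`, the sample rows, and the order in which the steps run are assumptions for illustration, not a description of how publish_dataset_to_weave is implemented.

```python
from typing import Any, Callable, Dict, List, Optional

Row = Dict[str, Any]


def apply_transforms_sketch(
    rows: List[Row],
    data_limit: Optional[int] = None,
    dataset_transforms: Optional[List[Callable[[Row], Row]]] = None,
    column_transforms: Optional[Dict[str, Callable[[Any], Any]]] = None,
) -> List[Row]:
    """Hypothetical helper mimicking the three preprocessing parameters."""
    # Assumption: data_limit keeps the first N rows.
    if data_limit is not None:
        rows = rows[:data_limit]
    processed = []
    for row in rows:
        # Assumption: dataset transforms run in list order on the whole row.
        for transform in dataset_transforms or []:
            row = transform(row)
        row = dict(row)  # copy so the caller's row is never mutated
        # Assumption: column transforms run afterwards, one per named column.
        for column, transform in (column_transforms or {}).items():
            if column in row:
                row[column] = transform(row[column])
        processed.append(row)
    return processed


# Made-up rows shaped like the MSCOCO example (nested "sentences" field).
rows = [
    {"sentences": {"raw": "a cat on a sofa"}, "image": "img_0"},
    {"sentences": {"raw": "a dog in a park"}, "image": "img_1"},
    {"sentences": {"raw": "a bird on a wire"}, "image": "img_2"},
]
result = apply_transforms_sketch(
    rows,
    data_limit=2,
    dataset_transforms=[lambda item: {**item, "sentences": item["sentences"]["raw"]}],
    column_transforms={"sentences": str.upper},
)
```

A dataset transform receives and returns a whole row, which suits restructuring such as flattening a nested field, while a column transform only sees the value of a single column.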
save_weave_dataset_rows_to_artifacts(dataset_rows, dump_dir)
Saves the dataset rows to W&B artifacts.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset_rows | List[Dict] | List of dataset rows. | required |
dump_dir | str | Directory to dump the results. | required |
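One plausible way to prepare rows for upload is to serialize them to a file inside dump_dir first. The sketch below covers only that local step, using the standard library. The helper name `dump_rows_as_jsonl`, the JSON Lines format, and the `data.jsonl` file name are assumptions for illustration; the source does not state which file format save_weave_dataset_rows_to_artifacts uses, and the W&B upload itself is not shown.

```python
import json
import os
import tempfile
from typing import Dict, List


def dump_rows_as_jsonl(dataset_rows: List[Dict], dump_dir: str, filename: str = "data.jsonl") -> str:
    """Hypothetical helper: write one JSON object per line and return the file path."""
    os.makedirs(dump_dir, exist_ok=True)
    path = os.path.join(dump_dir, filename)
    with open(path, "w", encoding="utf-8") as f:
        for row in dataset_rows:
            f.write(json.dumps(row) + "\n")
    return path


# Made-up rows; a temporary directory stands in for dump_dir.
sample_rows = [
    {"prompt": "a cat on a sofa", "label": 0},
    {"prompt": "a dog in a park", "label": 1},
]
with tempfile.TemporaryDirectory() as tmp:
    jsonl_path = dump_rows_as_jsonl(sample_rows, os.path.join(tmp, "dump"))
    with open(jsonl_path, encoding="utf-8") as f:
        lines = f.read().splitlines()
```

Rows must be JSON-serializable for this to work, so image values would need to be converted to strings (for example, base64) beforehand.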