Preparations for multivariate plotting #29877
This commit introduces the MultiNorm class to prepare for the introduction of multivariate plotting methods
        return x
    else:
        # in case of a dtype with multiple fields:
        try:
Would be good to get at least partial coverage for this branch.
I haven't really been involved in this work nor understand how it works, but there is quite a bit of introduced code to deal with multiple datatypes? If this will be covered by tests/functionality in later PRs, that is fine, if not, please add tests for (most of) it.
if self.norm.n_output != cmap_obj.n_variates:
    raise ValueError(f"The colormap {cmap} does not support "
                     f"{self.norm.n_output} variates as required by "
                     f"the {type(self.norm)} on this Colorizer.")
Error messages typically have no end dot (same comment applies throughout).
Thanks, I'll need to change this in the other PR as well.
    mask = np.empty(x.shape, dtype=np.dtype('bool, '*len(x.dtype.descr)))
    for dd, dm in zip(x.dtype.descr, mask.dtype.descr):
        mask[dm[0]] = ~(np.isfinite(x[dd[0]]))
    xm = np.ma.array(x, mask=mask, copy=False)
Do numpy masked arrays actually support struct arrays as mask, with possibly different masking of the fields?
I have found that this is the only way numpy supports masking dtypes with multiple fields, but I will see if [("mask", bool, len(x.dtype.descr))], as you suggest below, is a reasonable approach to using a single mask.
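To make the per-field masking in this thread concrete, here is a minimal sketch (not the PR's code). NumPy masked arrays accept a structured boolean mask whose fields mirror the data fields, so each field can be masked independently. The field names `f0`/`f1` and the sample values are assumptions chosen for illustration.

```python
import numpy as np

# Two-field structured array standing in for bivariate data.
# The field names "f0" and "f1" are illustrative assumptions.
x = np.zeros(3, dtype=[("f0", float), ("f1", float)])
x["f0"] = [1.0, np.nan, 3.0]
x["f1"] = [np.inf, 5.0, 6.0]

# Mask dtype with one boolean field per data field, using the same names.
mask_dtype = np.dtype([(name, bool) for name in x.dtype.names])
mask = np.zeros(x.shape, dtype=mask_dtype)

# Mask the non-finite entries of each field separately.
for name in x.dtype.names:
    mask[name] = ~np.isfinite(x[name])

xm = np.ma.array(x, mask=mask)

# Each field carries its own mask.
print(np.ma.getmaskarray(xm["f0"]))  # masks only the NaN entry of f0
print(np.ma.getmaskarray(xm["f1"]))  # masks only the inf entry of f1
```

The two fields end up with different masks, which is the behaviour the review question asks about.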
    else:
        # in case of a dtype with multiple fields:
        try:
            mask = np.empty(x.shape, dtype=np.dtype('bool, '*len(x.dtype.descr)))
Could the dtype be e.g. [("mask", bool, len(x.dtype.descr))] (with a slightly different API)?
This is an interesting idea. I'll make a prototype and see if this would add unnecessary complexity somewhere else.
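For reference, a small sketch of what the suggested dtype looks like in NumPy: `[("mask", bool, n)]` declares a single field that holds a length-`n` boolean subarray per element. The value of `n_fields` and the array length are assumptions for illustration, and the sketch does not show how such a mask would be wired into `np.ma` or the PR.

```python
import numpy as np

n_fields = 2  # number of variates; illustrative assumption

# One field named "mask" holding n_fields booleans per element.
mask_dtype = np.dtype([("mask", bool, n_fields)])
m = np.zeros(3, dtype=mask_dtype)

# The field is exposed as a plain boolean array with an extra trailing axis,
# so a single entry can be flagged by (element, variate) index.
m["mask"][1, 0] = True

print(m["mask"].shape)  # (3, 2)
```

Compared with one boolean field per data field, this layout keeps all flags in one regular array, at the cost of indexing variates by position rather than by field name.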
54a945c to eeb895c
41acef7 to 9c62126
@anntzer I think this is important, so I wanted to reply to this in the main thread.
The context here is that multivariate data is stored internally as an array with a data type with multiple fields. It should be noted that when a regular np.array is masked, and the mask is
I didn't actually get as far as prototyping this, but I did have a look around. I have found that it will largely involve changes to
I have tried to list the advantages/disadvantages of the two approaches below:
A: Use a masked array with a struct array.
Advantages:
Disadvantages:
B: Store the mask as an additional dtype in the struct array i.e.
Advantages:
Disadvantages:
Having looked at this, my personal opinion is that option A is more suitable for matplotlib because I think it will be easier to maintain. @anntzer let me know if I have interpreted your suggestion correctly, and if you agree with my assessment of approach A or B, or if you think I should make a full prototype to explore this further.
Thank you @QuLogic
Co-authored-by: Elliott Sales de Andrade
9c62126 to a276d89
This PR is superseded by #30511
PR summary
This PR continues the work of #28658, #28454, and #29876, aiming to close #14168 (Feature request: Bivariate colormapping).
This is part two of the former PR, #29221, and builds upon #29876. Please see #29221 for the previous discussion.
#29876 includes:
MultiNorm class. This is a subclass of colors.Normalize and holds n_variate norms.
MultiNorm class
This PR includes:
Features not included in this PR:
MultiNorm together with BivarColormap and MultivarColormap to the plotting functions axes.imshow(...), axes.pcolor, and axes.pcolormesh(...)
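The idea of a norm that holds one norm per variate, as the summary describes for MultiNorm, can be illustrated with a small standalone sketch. This is not matplotlib's implementation: `LinearNorm`, `MultiNormSketch`, and every attribute name below are hypothetical stand-ins, and the real class subclasses colors.Normalize instead.

```python
import numpy as np


class LinearNorm:
    """Hypothetical scalar norm: maps [vmin, vmax] linearly onto [0, 1]."""

    def __init__(self, vmin, vmax):
        self.vmin = vmin
        self.vmax = vmax

    def __call__(self, values):
        values = np.asarray(values, dtype=float)
        return (values - self.vmin) / (self.vmax - self.vmin)


class MultiNormSketch:
    """Hypothetical container that applies one norm to each variate."""

    def __init__(self, norms):
        self.norms = tuple(norms)

    @property
    def n_variates(self):
        # Number of component norms held by this object.
        return len(self.norms)

    def __call__(self, values):
        # Expect one array-like per variate, in the same order as the norms.
        if len(values) != self.n_variates:
            raise ValueError(
                f"Expected {self.n_variates} variates, got {len(values)}")
        return tuple(norm(v) for norm, v in zip(self.norms, values))


multi = MultiNormSketch([LinearNorm(0, 10), LinearNorm(-1, 1)])
scaled = multi(([0, 5, 10], [-1, 0, 1]))
print(scaled[0], scaled[1])
```

Each variate is normalized independently with its own limits, which is what lets a bivariate or multivariate colormap consume the components on a common [0, 1] scale.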