Reply on RC2

"Regarding the extreme temperature representation, based on data at the daily frequency, it has been shown that GCMs tend to have a warm bias over most land areas (Li et al., 2021) and that the horizontal resolution plays a minor role compared with the one it plays in the extreme precipitation representation (Kharin et al. 2013; Wei et al., 2019). Typically the warm extremes are computed based on maximum daily temperature, but in this work we want to verify the potential improvements induced by the increased resolution in the representation of extreme temperature events defined at two different time frequencies (daily and 6-hourly). For this reason we investigate the distribution of daily and 6-hourly average temperature, instead of maximum daily temperature (Scoccimarro and Navarra, 2020).
Regarding the extreme precipitation representation, based on simulations from a single GCM, some improvement in skill at higher resolution for some measures of extreme precipitation over certain regions of the globe has been found in the past (Wehner et al. 2014; Kopparla et al. 2013), and only recently have multi-model assessments on this topic been carried out, confirming that when the horizontal resolution is increased to ¼ of a degree (the highest adopted by the model that is the object of this study), the magnitude of simulated daily (Bador et al. 2020) and sub-daily (Wehner et al. 2021) precipitation extremes increases. On the other hand, this is not associated with a systematic improvement in the simulation of precipitation extremes when compared to observations and, quantitatively, at the global scale, the intensification of precipitation extremes at increased resolution varies substantially from model to model (Bador et al. 2020). Also, for grid point GCMs (as opposed to spectral GCMs), the fraction of land precipitation increases, largely due to better resolved orography (Vannière et al., 2019; Terai et al., 2018; Demory et al., 2014)."
While I agree with using two products to evaluate the model's precipitation, I would have to disagree with the use of ERA5 for that purpose. Precipitation is not assimilated in reanalyses and is thus a product of the model used to create it. Although ERA5 is a superior product to its predecessor, there are many known issues with ERA5 precipitation. See for example: It would thus be better to use another observational product to evaluate the model.

Rivoire
In the new version of the manuscript we do not rely on ERA5 precipitation for model evaluation. Also following Reviewer #1's comment on the same topic, we decided to use a new observational dataset (in addition to the CHIRPS dataset already in use) instead of ERA5. The MSWEP dataset (Beck et al. 2019) is a global precipitation product with a 0.1° spatial resolution, available at a 3-hourly temporal resolution and covering the period from 1979 to the near present. The dataset takes advantage of the complementary strengths of gauge-, satellite-, and reanalysis-based data to provide reliable precipitation estimates over the globe. With this dataset we compute seasonal averages and both daily and 6-hourly percentiles to evaluate model results (as was done with ERA5 in the previous version of the paper, but over a shorter period). The three suggested references have been added to the text to justify the choice not to use ERA5 precipitation for comparison. This is the sentence added to the text: "Since there are many known issues with ERA5 precipitation (Rivoire et al., 2021; Hu et al., 2020; Crosset et al. 2020
We agree that looking at the distribution of daily tasmax is different from looking at the distribution (and relative tails) of average daily temperature, but the approach used in the current manuscript (also used in Scoccimarro and Navarra, 2021) has some advantages. One is that tasmax depends on the model time step length (which differs between the versions of the model), while average temperature (daily or 6-hourly) is independent of the model time step. Also, using values averaged over a period (daily or 6-hourly) instead of tasmax gives more exhaustive information from the human health perspective: e.g. a few minutes (a model time step) at 42 °C might be less problematic for the human body than 6 hours at 38 °C.
In addition, since tasmax is defined only at the daily frequency (this is true for all of the CMIP5 and CMIP6 model output available on ESGF), it cannot be used to compare the role of model horizontal resolution in representing daily and 6-hourly statistics.
Last but not least, we recently retrieved CMCC-HR4 and CMCC-VHR4 tasmax and tasmin fields from the ESGF repository because we found a bug in both these daily datasets.
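As an illustration of the percentile-based evaluation referred to in this reply (99th percentile of 6-hourly and daily averages, and the model-minus-reference bias), a minimal sketch for a single grid point is given below. It is not the processing chain used for the manuscript: the function names `block_mean` and `p99_bias`, the synthetic gamma-distributed series, and the factor 1.2 applied to the "model" are all assumptions made for the example.

```python
import numpy as np


def block_mean(series, steps_per_block):
    """Average consecutive, non-overlapping blocks of a 1-D series."""
    series = np.asarray(series, dtype=float)
    n_blocks = series.size // steps_per_block  # an incomplete trailing block is dropped
    trimmed = series[: n_blocks * steps_per_block]
    return trimmed.reshape(n_blocks, steps_per_block).mean(axis=1)


def p99_bias(model_3h, reference_3h, steps_per_block):
    """Model-minus-reference difference of the 99th percentile of block means."""
    model_p99 = np.percentile(block_mean(model_3h, steps_per_block), 99)
    reference_p99 = np.percentile(block_mean(reference_3h, steps_per_block), 99)
    return model_p99 - reference_p99


# Synthetic 3-hourly series for one grid point (illustrative values only).
rng = np.random.default_rng(0)
reference = rng.gamma(shape=0.5, scale=2.0, size=8 * 360)
model = 1.2 * reference  # hypothetical model that overestimates uniformly

bias_6h = p99_bias(model, reference, steps_per_block=2)     # 2 x 3 h = 6-hourly means
bias_daily = p99_bias(model, reference, steps_per_block=8)  # 8 x 3 h = daily means
```

Averaging before taking the percentile is what makes the 6-hourly and daily statistics comparable across datasets with different native time steps; on real gridded data the same two steps would be applied along the time axis of each grid cell.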
The color schemes used to present the data make it difficult to understand the results. For one, they saturate very quickly. For example, on Fig. 1, it is nearly impossible to distinguish values between -6 and -20 (when it is printed on paper). Also, there are similar colors on both sides of the 0 point (e.g. green on Fig. 3). This made reading through the precipitation subsection particularly difficult, as I couldn't get a good sense of the size of the biases that were being shown. My suggestion would be to refer to the IPCC visual style guide: https://www.ipcc.ch/site/assets/uploads/2019/04/IPCC-visual-style-guide.pdf
Following this suggestion, in the new version of the manuscript the color schemes have been defined following the IPCC visual style guide. We also had to modify the structure of the figures to put the model fields, the reference and the bias fields into the same figure, dividing 6h/24h and DJF/JJA to follow the request of Reviewer #1.
I think the manuscript would benefit from an attempt at explaining some of the results that are presented. The authors described the convection scheme in Section 2.1, because "it is worthwhile to mention for our discussion on precipitation biases", but the convection scheme is never referred to when the results are presented. Does it explain the differences between results obtained with 6-hourly data and daily mean? And if so, how? Could dry biases in the model play a role in the extremes of near-surface temperature? Was the impact of resolution on extreme temperature and precipitation evaluated by other groups using the CAM model? Are the results consistent with those found here? Furthermore, Vannière et al. (2019) have shown a significant impact of resolution on precipitation over mountainous areas in HighResMIP models. Are the results presented here consistent with that study and others that have looked into this issue previously? Given that this is a single model study, it is difficult to evaluate if the results are model dependent. Expanding the discussion would help in that regard.

Vannière et al. (2019) Multi-model evaluation of the sensitivity of the global energy budget and hydrological cycle to resolution. Climate Dynamics, 52, 6817-6846.
We extended the description of the convection scheme in the standard version of CAM4, comparing it to the one adopted by CAM5: "In other words the deep convection scheme is triggered based on a minimum positive threshold of CAPE, same as in the standard resolution of the CAM5 model (Wang and Zhang, 2013)." This supports the text added to the discussion: "The high-resolution version of the model generates excess extreme precipitation in the wet, warm regions, or seasons, consistent with findings based on experiments carried out with the CAM5 atmospheric model at the same resolutions (Wehner et al., 2014), highlighting once again the importance of an extensive model tuning at the high resolution".
The differences between extreme precipitation biases in 6h and daily data when moving from standard to high resolution are not that evident, thus we didn't link this to the description of the precipitation parameterization.
Regarding the role of dry biases, we assume that this comment refers to the 99p bias, since the average precipitation (Figures S3 and S4 in the new version of the manuscript) tends to show a wet bias. A first investigation of the role of such a dry bias in modulating extreme near-surface temperature does not suggest a systematic relationship: the bias of the 99p of precipitation during summer (Figure 6) is dry for the low resolution model but wet for the high resolution model over part of the Maritime Continent and South America, while the bias in the 99p of near-surface temperature is positive for both models over the same regions (Figure 4). During winter, for both models, the most pronounced positive biases in 99p of temperature (Figure 3)
Minor comments
CMCC-CM2-HR and CMCC-CM2-VHR might be the name of the models, but it is strange to refer to a model with a resolution of 1 deg (for the atmospheric component) as high resolution. It might be easier for the reader to simply refer to the two configurations as standard resolution (1 deg) and high resolution (0.25 deg). To be clear, I am not suggesting changing the name of the models, but simply to use the terms standard resolution and high resolution (or something along those lines) when referring to CMCC-CM2-HR and CMCC-CM2-VHR.
Done: In the new version of the manuscript we use the terms "standard" and "high" instead of "high" and "very-high".
The authors should mention the name of the experiment from which the data are taken. Vannière et al. (2019) noted different responses in terms of the impact of resolution on precipitation between grid point and spectral models. As such, the type of atmospheric model should be highlighted and the authors should mention whether their results are consistent with that prior study.
We added a sentence in Section 2.1 to indicate the experiment from which the data are taken: "In the current analysis we investigate the hist-1950 HighResMIP experiment as described in section 2.3.".
We also made the grid point configuration explicit in Section 2.1: "The CMCC general circulation model has been developed in several configurations (Cherchi et al. 2019). The model uses as atmospheric component the CAM Atmospheric component (CAM4, Neale et al. 2010) in its grid point configuration"
A comparison of CMCC-CM2 model results to other HighResMIP results, based on the Vannière et al. analysis, is now part of the discussion (see the answer to your last major comment). The sentence has also been removed, in accordance with the rephrasing of the previous one.