Reply on RC1

A: As you point out, our early clumped isotope data were collected on one instrument (MAT 253) and corrected using a gas-based reference frame, whereas our later data were collected on another instrument (Nu Perspective) and corrected into the I-CDES carbonate-based reference frame. For some samples not analyzed for their clumped isotopic composition, we measured only d13C and d18O using a MAT 253 + Kiel IV device. To ensure consistency, we analyzed three to five in-house carbonate standards alongside samples and tracked their isotopic compositions through time and across instruments. We have compiled a table of summary statistics on these in-house standards through time and on both instruments, which can go into the supplementary material of our manuscript. This table shows that d13C, d18O, and D47 values of these standards have been constant (within error) through time and across instruments, so all sample data can confidently be combined and treated as a single dataset.

Q: How reproducible is your data?
A: General reproducibility can be demonstrated by the long-term standard deviation of the above-mentioned in-house carbonate standards, which are measured frequently but are not used in the absolute reference frame correction itself. For the MAT 253 analyses, long-term 1sd on D47 is 0.020 permil. For the Nu Perspective, long-term 1sd on D47 is 0.018 permil. Long-term 1sd values for d13C and d18O are generally better than 0.1 and 0.2 permil, respectively.

Q: Are your ETH standards values comparable to Intercarb values?
A: Yes. When ETH standards are used for correction to the I-CDES, we assign the values from Bernasconi et al. (2021), so they are by definition identical to the Intercarb values. ETH standards were not measured on the MAT 253 when a gas-based reference frame was used.

Q: Is this the first article published with data from your Nu-perspective? If no, please cite the article that describes the procedure, if yes, please give more details on the methodology of this MS.
A: Yes, it is, although we did not know this at the time of original submission. Below are additional details on the method that can be added, along with a citation to Mackey et al. (2020), who describe a similar instrument used in a slightly different mode (cold-finger mode).
The NuCarb device reacts sample powders in individual vials under vacuum at 70 °C by slowly injecting 150 µL of 103% phosphoric acid. Vials are isolated for the first 5 minutes of the reaction to reduce excess bubbling, then opened to a variable-temperature trap held at -160 °C for 15 additional minutes, continuously collecting evolved CO2. Next, this variable-temperature trap is warmed to -60 °C and CO2 is released to pass through a static trap filled with Porapak Q material held at -30 °C. CO2 is frozen on the far side in another variable-temperature trap set at -160 °C for 800 seconds. CO2 is then warmed to the gas phase and the yield is recorded with a transducer. The Nu Perspective and NuCarb can handle both larger (3-6 mg) and smaller (350-450 µg) samples. Depending on the transducer reading following purification, the automated sequence directs CO2 from a larger sample into the bellows or freezes CO2 from a smaller sample into a cold finger. All samples in this study were between 3 and 5 mg in size and were therefore analyzed in "bellows mode". Previous studies using a Nu Perspective and NuCarb for clumped isotope analysis described "cold-finger mode" (Mackey et al., 2020). In bellows mode, bellows are compressed until a desired beam strength of 80 nA is achieved on the major (m/z 44) ion beam. Gas is analyzed for 4 blocks of 20 reference-sample cycles, with 20 seconds of integration on each half cycle. Bellows are adjusted between cycles to maintain the initial beam strength throughout the analysis.

Q: How did you convert the data from the D47 into an absolute repository (software, codes)?
A: Between 2016 and 2020, a first set of fossil shell powders was measured for Δ47 using a Thermo-Finnigan MAT 253 dual inlet isotope ratio mass spectrometer and a manual sample preparation device described in detail by Defliese et al. (2015). Raw Δ47 values were placed in a gas-based absolute reference frame following methods introduced by Dennis et al. (2011), using theoretical equilibrium values for heated (1000 °C) and water-equilibrated (25 °C) standard gases (0.0266 and 0.9198, respectively) and the 75 °C acid fractionation factor of +0.072 ‰ from Petersen et al. (2019). Boundaries between each 1-3-week-long correction window were selected to account for drift in equilibrium gas line slopes and to maintain consistent values of in-house carbonate standards through time.
In 2021 and 2022, additional powders were analyzed in the University of Michigan SCIPP lab using a Nu Perspective dual inlet isotope ratio mass spectrometer connected to a NuCarb automated sample preparation device. Isotopic values were converted into the Intercarb Carbon Dioxide Equilibrium Scale (I-CDES25) absolute reference frame using Δ47 values for four ETH standards defined by the Intercarb project (Bernasconi et al., 2021) and an acid fractionation factor of +0.066 ‰ for 70 °C from Petersen et al. (2019). First, a single slope was fitted through ETH 1 and ETH 2 (adjusted by 0.0033 ‰) in δ47 vs. Δ47 space, then all four ETH standards were used for the empirical transfer function step.
This description can be added to the methods section of the paper and all calculated equilibrium gas line slopes and transfer function slopes and intercepts will be included in the final data table archived in EarthChem ClumpDB.

Q: You can also specify the number of measurement sessions and how you calculate your temperature uncertainties. It must be written in the section method, not in the caption of a figure. In the legend, the definition of your uncertainties is not clear.
A: The line between a "measurement session" (which might span a few months between major maintenance or power outages) and a "correction window" (a period from which standards are selected to correct a subset of data) is blurry. Multiple correction windows make up one measurement session. We describe the measurement sessions in the response above, which we can add to the methods section as written. Information about specific correction windows (which we feel is too granular for most readers) can be found in a column labeled "window" in the EarthChem data template.
Long-term reproducibility (1sd) of Δ47 in carbonate standards (0.020 ‰ for MAT 253 and 0.018 ‰ for Nu) was used to calculate a long-term standard error in Δ47 for each sample. For temperature and δ18Osw values, a long-term standard error was calculated using the long-term standard error in Δ47 and the pseudo-linear relationship between internal 1SE on Δ47 and either temperature or δ18Osw in all samples (for this study: long-term 1SE in temperature = 306.7*(long-term 1SE in Δ47) + 0.1778 and long-term 1SE in δ18Osw = 66.33*(long-term 1SE in Δ47) + 0.0461). The external 1SE for all three parameters was selected as the larger of the internal and long-term standard error. In all plots, we use the external 1SE for temperature or δ18Osw.
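The uncertainty calculation described above can be sketched as follows. This is a minimal illustration and not the code used for the manuscript: the function and variable names are hypothetical, and the step from long-term 1sd to long-term 1SE is assumed here to be 1sd divided by the square root of the number of replicates, which the text above does not state explicitly. The regression coefficients and long-term 1sd values are those quoted above.

```python
import math

# Long-term 1sd of D47 in in-house carbonate standards (permil), from the text.
LONG_TERM_SD_D47 = {"MAT253": 0.020, "Nu": 0.018}


def long_term_se_d47(instrument, n_replicates):
    """Long-term 1SE in D47 for one sample.

    ASSUMPTION: SE = long-term 1sd / sqrt(number of replicates).
    """
    return LONG_TERM_SD_D47[instrument] / math.sqrt(n_replicates)


def long_term_se_temperature(se_d47):
    """Pseudo-linear relationship quoted in the text (this study only)."""
    return 306.7 * se_d47 + 0.1778


def long_term_se_d18osw(se_d47):
    """Pseudo-linear relationship quoted in the text (this study only)."""
    return 66.33 * se_d47 + 0.0461


def external_se(instrument, n_replicates, internal_se):
    """External 1SE = larger of internal and long-term 1SE, per parameter.

    internal_se is a dict with keys "D47", "T", and "d18Osw".
    """
    lt_d47 = long_term_se_d47(instrument, n_replicates)
    long_term = {
        "D47": lt_d47,
        "T": long_term_se_temperature(lt_d47),
        "d18Osw": long_term_se_d18osw(lt_d47),
    }
    return {key: max(internal_se[key], long_term[key]) for key in long_term}


# Hypothetical sample: 4 replicates on the Nu Perspective.
example = external_se("Nu", 4, {"D47": 0.012, "T": 2.5, "d18Osw": 0.7})
```

In this made-up example, the internal 1SE is retained for Δ47 and δ18Osw because it exceeds the long-term value, while the long-term 1SE is used for temperature.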

Q: Would it be possible to replicate some samples to reduce the uncertainties on the temperatures?
A: Since receiving this review, we have added replicates to the few samples with the worst reproducibility. We have also done more thorough replicate screening (a description of which can be added to the methods), removing a single replicate if it falls more than 2.5 standard deviations from the mean of the remaining replicates. These combined efforts improved error bars on a number of samples. Additionally, we have now analyzed a few new samples to increase the number of samples per horizon and to make our conclusions more robust. These new data reinforce prior conclusions, such as the timing and magnitude of warming during the Late Maastrichtian Warming Event.

Q: You have measured different carbonates (matrix and fossil bivalves; different species). However, you are not comparing this data. For example, how do you explain the temperature difference between Acutostrea and Agerostrea at 66.5 Ma? A discussion of this comparison would improve the manuscript.
A: We disagree with the reviewer that the temperature difference between Acutostrea and Agerostrea at 66.5 Ma is meaningful, as it compares two single samples from one horizon. Yes, that is the horizon with the largest variability and it happens to include multiple taxa, but we see a similar level of variability between samples of the same taxon from the same horizon. Fossils found at the same horizon were not necessarily living at exactly the same time, so each shell may accurately record its own living environment even though that environment was not exactly the same as that of another shell from the same horizon. Also, for the most part, shells included in this study are small oysters, so it is possible that our drilled powder has aliased certain seasons in one shell vs. another.

Q: ... Figure S3 in the main text, which is much more detailed.
A: We think having a zoomed-in version of the temperature profile is beneficial for seeing interspecies differences that are not visible in the big composite figure (previously Figure 5), but we agree that Figure 4 and Figure S3 are redundant. We now prefer to include Figure S3 in the main text, as it shows the removed data points as well as the species-level differences.

Q: A table with the data, such as the last table in the material supplement, can be included in the main text.