{"id":348,"date":"2023-02-15T21:32:50","date_gmt":"2023-02-15T21:32:50","guid":{"rendered":"https:\/\/todaysainews.com\/index.php\/2023\/02\/15\/a-novel-differentially-private-aggregation-framework-google-ai-blog\/"},"modified":"2025-04-27T07:34:18","modified_gmt":"2025-04-27T07:34:18","slug":"a-novel-differentially-private-aggregation-framework-google-ai-blog","status":"publish","type":"post","link":"https:\/\/todaysainews.com\/index.php\/2023\/02\/15\/a-novel-differentially-private-aggregation-framework-google-ai-blog\/","title":{"rendered":"A novel differentially private aggregation framework \u2013 Google AI Blog"},"content":{"rendered":"<div id=\"post-body-6601913534843092690\">\n<span class=\"byline-author\">Posted by Haim Kaplan and Yishay Mansour, Research Scientists, Google Research<br \/>\n<\/span><\/p>\n<p>\n<a href=\"https:\/\/en.wikipedia.org\/wiki\/Differential_privacy\">Differential privacy<\/a> (DP) machine learning algorithms protect user data by limiting the effect of each data point on an aggregated output with a mathematical guarantee. Intuitively, the guarantee implies that changing a single user\u2019s contribution should not significantly change the output distribution of the DP algorithm.\n<\/p>\n<p><a name=\"more\"\/><\/p>\n<p>\nHowever, DP algorithms tend to be less accurate than their non-private counterparts because satisfying DP is a <em>worst-case<\/em> requirement: one has to add noise to \u201chide\u201d changes in any <em>potential<\/em> input point, including &#8220;unlikely points&#8221; that have a significant impact on the aggregation. 
For example, suppose we want to privately estimate the average of a dataset, and we know that a sphere of diameter, \u039b, contains all possible data points. The sensitivity of the average to a single point is bounded by \u039b, and therefore it suffices to add noise proportional to \u039b to each coordinate of the average to ensure DP.\n<\/p>\n<table align=\"center\" cellpadding=\"0\" cellspacing=\"0\" class=\"tr-caption-container\" style=\"margin-left: auto; margin-right: auto;\">\n<tbody>\n<tr>\n<td style=\"text-align: center;\"><a href=\"https:\/\/blogger.googleusercontent.com\/img\/b\/R29vZ2xl\/AVvXsEhACWru1FVCgdudxwNL9k3HUZlYPqmE6dvxaqNTleid5dIMFnYKvx38IV39g0UuCvdToSBETlVltZFUNilTE2hIoK_CNGF9NG1jXpAkL6sVVw2yIbC5YUHrNryL5J3O2o0lq5jGnXFXSw1NkVqx4mB1Y2FRtKHrD7000O9K6IVtkk4kmwz5pQ2WeWI84w\/s185\/image25.png\" style=\"margin-left: auto; margin-right: auto;\"><img decoding=\"async\" border=\"0\" data-original-height=\"184\" data-original-width=\"185\" src=\"https:\/\/blogger.googleusercontent.com\/img\/b\/R29vZ2xl\/AVvXsEhACWru1FVCgdudxwNL9k3HUZlYPqmE6dvxaqNTleid5dIMFnYKvx38IV39g0UuCvdToSBETlVltZFUNilTE2hIoK_CNGF9NG1jXpAkL6sVVw2yIbC5YUHrNryL5J3O2o0lq5jGnXFXSw1NkVqx4mB1Y2FRtKHrD7000O9K6IVtkk4kmwz5pQ2WeWI84w\/s16000\/image25.png\"\/><\/a><\/td>\n<\/tr>\n<tr>\n<td class=\"tr-caption\" style=\"text-align: center;\">A sphere of diameter \u039b containing all possible data points.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>\nNow assume that all the data points are &#8220;friendly,&#8221; meaning they are close together, and each affects the average by at most \ud835\udc5f, which is much smaller than \u039b. 
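The worst-case baseline described above (noise proportional to \u039b rather than \ud835\udc5f) can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the function name and parameters (`diameter`, `epsilon`) are ours, it uses the standard Laplace mechanism per coordinate, and a complete privacy accounting would also involve the dimension.

```python
import numpy as np

def private_mean_worst_case(points, diameter, epsilon, rng=None):
    """Estimate a mean with per-coordinate Laplace noise scaled to the
    worst-case diameter (Lambda), as in the traditional DP baseline.

    Illustrative sketch only: a full analysis would also account for the
    dimension in the noise scale.
    """
    rng = np.random.default_rng() if rng is None else rng
    points = np.asarray(points, dtype=float)
    n, d = points.shape
    # Replacing one point inside a sphere of diameter Lambda moves each
    # coordinate of the average by at most Lambda / n.
    sensitivity = diameter / n
    noise = rng.laplace(scale=sensitivity / epsilon, size=d)
    return points.mean(axis=0) + noise
```

Even when the data is tightly clustered, the noise scale here is dictated by the worst-case `diameter`, which is exactly the inefficiency FriendlyCore targets.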
Still, the traditional way for ensuring DP requires adding noise proportional to \u039b to account for a neighboring dataset that contains one additional &#8220;unfriendly&#8221; point that is unlikely to be sampled.\n<\/p>\n<table align=\"center\" cellpadding=\"0\" cellspacing=\"0\" class=\"tr-caption-container\" style=\"margin-left: auto; margin-right: auto;\">\n<tbody>\n<tr>\n<td style=\"text-align: center;\"><a href=\"https:\/\/blogger.googleusercontent.com\/img\/b\/R29vZ2xl\/AVvXsEgNSPP9i8OTDLApTKxzIh5-Lb-CqKp67zJTxHCUi-EmcN9c5t9SDbOt1jcUx1DVi8bhWKEw8u8UsSLlAIxfYmuNZffZnfa9OsCe_E7CYcq3XxZetzsu5VVEsHQ1GJBzCDIRfeyg6R4LRjDc5Z0_TTlMbMzCOBxP6_3feBixK56zHtP6mvjRbF7QgBd63g\/s455\/image29.png\" style=\"margin-left: auto; margin-right: auto;\"><img decoding=\"async\" border=\"0\" data-original-height=\"208\" data-original-width=\"455\" src=\"https:\/\/blogger.googleusercontent.com\/img\/b\/R29vZ2xl\/AVvXsEgNSPP9i8OTDLApTKxzIh5-Lb-CqKp67zJTxHCUi-EmcN9c5t9SDbOt1jcUx1DVi8bhWKEw8u8UsSLlAIxfYmuNZffZnfa9OsCe_E7CYcq3XxZetzsu5VVEsHQ1GJBzCDIRfeyg6R4LRjDc5Z0_TTlMbMzCOBxP6_3feBixK56zHtP6mvjRbF7QgBd63g\/s16000\/image29.png\"\/><\/a><\/td>\n<\/tr>\n<tr>\n<td class=\"tr-caption\" style=\"text-align: center;\">Two adjacent datasets that differ in a single outlier. A DP algorithm would have to add noise proportional to \u039b to each coordinate to hide this outlier.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>\nIn \u201c<a href=\"https:\/\/proceedings.mlr.press\/v162\/tsfadia22a\/tsfadia22a.pdf\">FriendlyCore: Practical Differentially Private Aggregation<\/a>\u201d, presented at <a href=\"https:\/\/icml.cc\/Conferences\/2022\">ICML 2022<\/a>, we introduce a general framework for computing differentially private aggregations. The FriendlyCore framework pre-processes data, extracting a \u201cfriendly\u201d subset (the core) and consequently reducing the private aggregation error seen with traditional DP algorithms. 
The private aggregation step adds less noise since we do not need to account for unfriendly points that negatively impact the aggregation.\n<\/p>\n<p>\nIn the averaging example, we first apply <em>FriendlyCore<\/em> to remove outliers, and in the aggregation step, we add noise proportional to \ud835\udc5f (not \u039b). The challenge is to make our overall algorithm (outlier removal + aggregation) differentially private. This constrains our outlier removal scheme and stabilizes the algorithm so that two adjacent inputs that differ by a single point (outlier or not) should produce any (friendly) output with similar probabilities.\n<\/p>\n<h2>FriendlyCore Framework<\/h2>\n<p>\nWe begin by formalizing when a dataset is considered <em>friendly<\/em>, which depends on the type of aggregation needed and should capture datasets for which the sensitivity of the aggregate is small. For example, if the aggregate is averaging, the term <em>friendly<\/em> should capture datasets with a small diameter.\n<\/p>\n<p>\nTo abstract away the particular application, we define friendliness using a predicate \ud835\udc53 that is positive on points \ud835\udc65 and \ud835\udc66 if they are \u201cclose\u201d to each other. For example, in the averaging application \ud835\udc65 and \ud835\udc66 are close if the distance between them is less than \ud835\udc5f. We say that a dataset is friendly (for this predicate) if every pair of points \ud835\udc65 and \ud835\udc66 are both close to a common point \ud835\udc67 (not necessarily in the data).\n<\/p>\n<p>\nOnce we have fixed \ud835\udc53 and defined when a dataset is friendly, two tasks remain. First, we construct the <em>FriendlyCore<\/em> algorithm that extracts a large friendly subset (the core) of the input stably. 
<em>FriendlyCore<\/em> is a filter satisfying two requirements: (1) It has to remove outliers to keep only elements that are close to many others in the core, and (2) for neighboring datasets that differ by a single element, \ud835\udc66, the filter outputs each element except \ud835\udc66 with almost the same probability. Furthermore, the union of the cores extracted from these neighboring datasets is friendly.\n<\/p>\n<p>\nThe idea underlying <em>FriendlyCore<\/em> is simple: The probability that we add a point, \ud835\udc65, to the core is a monotonic and stable function of the number of elements close to \ud835\udc65. In particular, if \ud835\udc65 is close to all other points, it\u2019s not considered an outlier and can be kept in the core with probability 1.\n<\/p>\n<p>\nSecond, we develop the <em>Friendly DP <\/em>algorithm that satisfies a weaker notion of privacy by adding less noise to the aggregate. This means that the outcomes of the aggregation are guaranteed to be similar only for neighboring datasets \ud835\udc36 and \ud835\udc36&#8217; such that the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Union_(set_theory)\">union<\/a> of \ud835\udc36 and \ud835\udc36&#8217; is <em>friendly<\/em>.\n<\/p>\n<p>\n<a href=\"https:\/\/proceedings.mlr.press\/v162\/tsfadia22a.html\">Our main theorem<\/a> states that if we apply a friendly DP aggregation algorithm to the core produced by a filter with the requirements listed above, then this composition is differentially private in the regular sense.\n<\/p>\n<h2>Clustering and other applications<\/h2>\n<p>\nOther applications of our aggregation method are <a href=\"https:\/\/en.wikipedia.org\/wiki\/Cluster_analysis\">clustering<\/a> and learning the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Covariance_matrix#:~:text=In%20probability%20theory%20and%20statistics,of%20a%20given%20random%20vector\">covariance matrix<\/a> of a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Normal_distribution\">Gaussian distribution<\/a>. 
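The core-extraction idea described above can be sketched in a few lines. This is a hedged illustration, not the paper's exact filter: the names (`is_close`, `friendly_core`) are ours, and the clipped-linear keep-probability is just one example of a monotonic, stable function of the number of close neighbors.

```python
import numpy as np

def is_close(x, y, r):
    """Friendliness predicate f: x and y are "close" if within distance r."""
    return np.linalg.norm(np.asarray(x) - np.asarray(y)) < r

def friendly_core(points, r, rng=None):
    """Keep each point with a probability that grows monotonically with the
    number of other elements close to it.

    The linear ramp below is an illustrative stable function, not the exact
    one used in the FriendlyCore paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    pts = [np.asarray(p, dtype=float) for p in points]
    n = len(pts)
    core = []
    for i, x in enumerate(pts):
        n_close = sum(is_close(x, y, r) for j, y in enumerate(pts) if j != i)
        # A point close to (almost) all others is kept with probability ~1;
        # an outlier close to few points is almost surely dropped.
        keep_prob = np.clip((n_close - n / 2) / (n / 2), 0.0, 1.0)
        if rng.random() < keep_prob:
            core.append(x)
    return core
```

Because the keep-probability changes only slightly when one input point is added or removed, a filter of this shape is the kind of stable pre-processing the framework composes with a friendly DP aggregator.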
Consider the use of FriendlyCore to develop a differentially private <a href=\"http:\/\/proceedings.mlr.press\/v139\/cohen21c.html\">k-means clustering algorithm<\/a>. Given a database of points, we partition it into random equal-size smaller subsets and run a good <i>non<\/i>-private <em>k<\/em>-means clustering algorithm on each small set. If the original dataset contains <em>k<\/em> large clusters then each smaller subset will contain a significant fraction of each of these <em>k<\/em> clusters. It follows that the tuples (ordered sets) of <em>k<\/em>-centers we get from the non-private algorithm for each small subset are similar. This dataset of tuples is expected to have a large friendly core (for an appropriate definition of closeness).\n<\/p>\n<table align=\"center\" cellpadding=\"0\" cellspacing=\"0\" class=\"tr-caption-container\" style=\"margin-left: auto; margin-right: auto;\">\n<tbody>\n<tr>\n<td style=\"text-align: center;\"><a href=\"https:\/\/blogger.googleusercontent.com\/img\/b\/R29vZ2xl\/AVvXsEgsXg-qq3mTeBpE6w_b4mqy7sfz3nOZo9PQWAIHg_8fihN901NZn76IZpc8-XwZcKNDafj7glWFqVbv3cUfKtZSuUnw75FM7TyjYbAoqwOMex2I33aAR5v2rHg1IIKz1oPN-rzpv5Wlt4zp-PQK1-onlUomqiaJtodXpEP83jMqkoFYuix-i35Q8ai-7g\/s1132\/FriendlyCore%20Clustering.png\" style=\"margin-left: auto; margin-right: auto;\"><img decoding=\"async\" border=\"0\" data-original-height=\"700\" data-original-width=\"1132\" src=\"https:\/\/blogger.googleusercontent.com\/img\/b\/R29vZ2xl\/AVvXsEgsXg-qq3mTeBpE6w_b4mqy7sfz3nOZo9PQWAIHg_8fihN901NZn76IZpc8-XwZcKNDafj7glWFqVbv3cUfKtZSuUnw75FM7TyjYbAoqwOMex2I33aAR5v2rHg1IIKz1oPN-rzpv5Wlt4zp-PQK1-onlUomqiaJtodXpEP83jMqkoFYuix-i35Q8ai-7g\/s16000\/FriendlyCore%20Clustering.png\"\/><\/a><\/td>\n<\/tr>\n<tr>\n<td class=\"tr-caption\" style=\"text-align: center;\"\/><\/tr>\n<\/tbody>\n<\/table>\n<p>\nWe use our framework to aggregate the resulting tuples of <em>k<\/em>-centers (<em>k<\/em>-tuples). 
We define two such <em>k<\/em>-tuples to be close if there is a matching between them such that a center is substantially closer to its mate than to any other center.\n<\/p>\n<table align=\"center\" cellpadding=\"0\" cellspacing=\"0\" class=\"tr-caption-container\" style=\"margin-left: auto; margin-right: auto;\">\n<tbody>\n<tr>\n<td style=\"text-align: center;\"><a href=\"https:\/\/blogger.googleusercontent.com\/img\/b\/R29vZ2xl\/AVvXsEh9nkywuZzssdSyPszJiygkDvIXR8nNUC3mdPAHSQhewuHoFEoEi9lpDrkj0YLXeNrwU6PInVbjfSsLLOh4ECRNiEPIExCv9LyH-bxDD8JJuqAavhacP5LrGlwD8Ga8Glg5S1H2zTR6VlpP2chBdfYQ7FB-u0hV86TxYqc5ri_bhyQH_A6-Dvoxg2j8mA\/s585\/image27.png\" style=\"margin-left: auto; margin-right: auto;\"><img loading=\"lazy\" decoding=\"async\" border=\"0\" data-original-height=\"384\" data-original-width=\"585\" height=\"263\" src=\"https:\/\/blogger.googleusercontent.com\/img\/b\/R29vZ2xl\/AVvXsEh9nkywuZzssdSyPszJiygkDvIXR8nNUC3mdPAHSQhewuHoFEoEi9lpDrkj0YLXeNrwU6PInVbjfSsLLOh4ECRNiEPIExCv9LyH-bxDD8JJuqAavhacP5LrGlwD8Ga8Glg5S1H2zTR6VlpP2chBdfYQ7FB-u0hV86TxYqc5ri_bhyQH_A6-Dvoxg2j8mA\/w400-h263\/image27.png\" width=\"400\"\/><\/a><\/td>\n<\/tr>\n<tr>\n<td class=\"tr-caption\" style=\"text-align: center;\">In this picture, any pair of the red, blue, and green tuples are close to each other, but none of them is close to the pink tuple. So the pink tuple is removed by our filter and is not in the core.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>\nWe then extract the core by our generic sampling scheme and aggregate it using the following steps:\n<\/p>\n<ol>\n<li>Pick a random <em>k<\/em>-tuple \ud835\udc47 from the core.\n<\/li>\n<li>Partition the data by putting each point in a bucket according to its closest center in \ud835\udc47.\n<\/li>\n<li>Privately average the points in each bucket to get our final <em>k<\/em>-centers.\n<\/li>\n<\/ol>\n<h2>Empirical results<\/h2>\n<p>\nBelow are the empirical results of our algorithms based on <em>FriendlyCore<\/em>. 
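The three aggregation steps listed above (pick a random \ud835\udc58-tuple from the core, bucket points by their nearest center, privately average each bucket) can be sketched as follows. This is an assumed, simplified rendering: the function name is ours, and the private averaging is stubbed with illustrative Laplace noise rather than the exact mechanism from the paper.

```python
import numpy as np

def aggregate_k_tuples(core_tuples, data, diameter, epsilon, rng=None):
    """Sketch of the three aggregation steps for k-means tuples.

    `core_tuples` is a list of (k, d) arrays of centers surviving the filter;
    `data` is the (n, d) array of original points. The noise scale here is a
    placeholder, not a full privacy accounting.
    """
    rng = np.random.default_rng() if rng is None else rng
    # 1. Pick a random k-tuple T from the core.
    T = core_tuples[rng.integers(len(core_tuples))]
    # 2. Partition the data by putting each point in the bucket of its
    #    closest center in T.
    dists = np.linalg.norm(data[:, None, :] - T[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # 3. Privately average the points in each bucket to get the final centers.
    centers = []
    for c in range(len(T)):
        bucket = data[labels == c]
        m = max(len(bucket), 1)
        mean = bucket.mean(axis=0) if len(bucket) else T[c]
        noise = rng.laplace(scale=diameter / (m * epsilon), size=data.shape[1])
        centers.append(mean + noise)
    return np.array(centers)
```

Since every surviving tuple places its centers in roughly the same positions, averaging within buckets defined by one random tuple yields accurate centers with little added noise.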
We implemented them in the <a href=\"https:\/\/arxiv.org\/abs\/1605.02065\">zero-Concentrated Differential Privacy<\/a> (zCDP) model, which gives improved accuracy in our setting (with similar privacy guarantees as the <a href=\"https:\/\/www.iacr.org\/archive\/eurocrypt2006\/40040493\/40040493.pdf\">more well-known <\/a> <a href=\"https:\/\/www.iacr.org\/archive\/eurocrypt2006\/40040493\/40040493.pdf\">(\ud835\udf16, \ud835\udeff)-DP<\/a>).\n<\/p>\n<h3>Averaging<\/h3>\n<p>\nWe tested the mean estimation of 800 samples from a spherical Gaussian with an <em>unknown<\/em> mean. We compared it to the algorithm <em><a href=\"https:\/\/arxiv.org\/abs\/2006.06618\">CoinPress<\/a><\/em>. In contrast to FriendlyCore, <em>CoinPress<\/em> requires an upper bound \ud835\udc45 on the norm of the mean. The figures below show the effect on accuracy when increasing \ud835\udc45 or the dimension \ud835\udc51. Our averaging algorithm performs better on large values of these parameters  since it is independent of \ud835\udc45 and \ud835\udc51.<\/p>\n<table align=\"center\" cellpadding=\"0\" cellspacing=\"0\" class=\"tr-caption-container\" style=\"margin-left: auto; margin-right: auto;\">\n<tbody>\n<tr>\n<td style=\"text-align: center;\"><a href=\"https:\/\/blogger.googleusercontent.com\/img\/b\/R29vZ2xl\/AVvXsEgW-WMWKPbPsCZ0V5UurlX79tEZN4ee6326Fs9_Fk_YRycYRtlx9qs5PoGP4qgUUN8-CRcb56PDKeb4Yi8XnDCIuqYWr4mEBGB6bkmawJZB3KW_AhjFkp03E7QYpLslDLhufYuvkXDc8rhczOlCkFwYjcmyMgJCLi-2wZ0wZQU9h_vgOCITkO4TeZV5xA\/s640\/image37.png\" style=\"margin-left: auto; margin-right: auto;\"><img decoding=\"async\" border=\"0\" data-original-height=\"480\" data-original-width=\"640\" src=\"https:\/\/blogger.googleusercontent.com\/img\/b\/R29vZ2xl\/AVvXsEgW-WMWKPbPsCZ0V5UurlX79tEZN4ee6326Fs9_Fk_YRycYRtlx9qs5PoGP4qgUUN8-CRcb56PDKeb4Yi8XnDCIuqYWr4mEBGB6bkmawJZB3KW_AhjFkp03E7QYpLslDLhufYuvkXDc8rhczOlCkFwYjcmyMgJCLi-2wZ0wZQU9h_vgOCITkO4TeZV5xA\/s16000\/image37.png\"\/><\/a><\/td>\n<td style=\"text-align: 
center;\"><a href=\"https:\/\/blogger.googleusercontent.com\/img\/b\/R29vZ2xl\/AVvXsEjGqEHwWu6ZYVbOe_7lx29lYXXJqq0dmz6-HA96VfxnuD5PkPHoLrTCmQUXtVoXBhcon6-AA_I5VTEiF2flcr3aqOI83byKbapDs57G73JZquxKD1ti5fzG9kvHbXoNRzmsmFdFjUdt2irl7sjV_-OerfW5DjWBMMFLPmVSVzlftO4XI9VX1cT72h5p9Q\/s640\/image41.png\" style=\"margin-left: auto; margin-right: auto;\"><img decoding=\"async\" border=\"0\" data-original-height=\"480\" data-original-width=\"640\" src=\"https:\/\/blogger.googleusercontent.com\/img\/b\/R29vZ2xl\/AVvXsEjGqEHwWu6ZYVbOe_7lx29lYXXJqq0dmz6-HA96VfxnuD5PkPHoLrTCmQUXtVoXBhcon6-AA_I5VTEiF2flcr3aqOI83byKbapDs57G73JZquxKD1ti5fzG9kvHbXoNRzmsmFdFjUdt2irl7sjV_-OerfW5DjWBMMFLPmVSVzlftO4XI9VX1cT72h5p9Q\/s16000\/image41.png\"\/><\/a><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><\/p>\n<table align=\"center\" cellpadding=\"0\" cellspacing=\"0\" class=\"tr-caption-container\" style=\"margin-left: auto; margin-right: auto;\">\n<tbody>\n<tr>\n<td class=\"tr-caption\" style=\"text-align: center;\"> <strong>Left<\/strong>: Averaging in \ud835\udc51= 1000, varying \ud835\udc45. <strong>Right<\/strong>: Averaging with \ud835\udc45= \u221a\ud835\udc51, varying \ud835\udc51.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><\/p>\n<h3>Clustering<\/h3>\n<p>\nWe tested the performance of our private clustering algorithm for <em>k<\/em>-means. We compared it to the <a href=\"https:\/\/ai.googleblog.com\/2021\/10\/practical-differentially-private.html\">Chung and Kamath<\/a> algorithm that is based on recursive <a href=\"https:\/\/en.wikipedia.org\/wiki\/Locality-sensitive_hashing\">locality-sensitive hashing<\/a> (LSH-clustering). For each experiment, we performed 30 repetitions and present the medians along with the 0.1 and 0.9 quantiles. 
In each repetition, we normalize the losses by the loss of  <a href=\"https:\/\/en.wikipedia.org\/wiki\/K-means%2B%2B\">k-means++<\/a> (where a smaller number is better).\n<\/p>\n<p>\nThe left figure below compares the <em>k<\/em>-means results on a uniform mixture of eight separated Gaussians in two dimensions. For small values of \ud835\udc5b (the number of samples from the mixture), FriendlyCore often fails and yields inaccurate results. Yet, increasing \ud835\udc5b increases the success probability of our algorithm (because the generated tuples become closer to each other) and yields very accurate results, while LSH-clustering lags behind.\n<\/p>\n<table align=\"center\" cellpadding=\"0\" cellspacing=\"0\" class=\"tr-caption-container\" style=\"margin-left: auto; margin-right: auto;\">\n<tbody>\n<tr>\n<td style=\"text-align: center;\"><a href=\"https:\/\/blogger.googleusercontent.com\/img\/b\/R29vZ2xl\/AVvXsEhuBS36-rINaLT2Ojoau93citscAzZ7MIDnaj3i9ISKEmRS7dPnydehp-RCtaBOPQNeHPAJNr3vcLMnnxrEJiM95mG6DqN9ylN6T-K1KpTgeQojNcnqk79h5vUBIN5pglIY7_MQ3X5dVgu7CpEoZhvlGeIxZeWmSQgE5CCsQsv5Ct-cE_a6kw1wwe7_YQ\/s1124\/k-means%20results%20for%20varying%20number%20of%20samples.png\" style=\"margin-left: auto; margin-right: auto;\"><img decoding=\"async\" border=\"0\" data-original-height=\"554\" data-original-width=\"1124\" src=\"https:\/\/blogger.googleusercontent.com\/img\/b\/R29vZ2xl\/AVvXsEhuBS36-rINaLT2Ojoau93citscAzZ7MIDnaj3i9ISKEmRS7dPnydehp-RCtaBOPQNeHPAJNr3vcLMnnxrEJiM95mG6DqN9ylN6T-K1KpTgeQojNcnqk79h5vUBIN5pglIY7_MQ3X5dVgu7CpEoZhvlGeIxZeWmSQgE5CCsQsv5Ct-cE_a6kw1wwe7_YQ\/s16000\/k-means%20results%20for%20varying%20number%20of%20samples.png\"\/><\/a><\/td>\n<\/tr>\n<tr>\n<td class=\"tr-caption\" style=\"text-align: center;\"><strong>Left<\/strong>: <em>k<\/em>-means results in \ud835\udc51= 2 and <em>k<\/em>= 8, for varying \ud835\udc5b(number of samples). 
<strong>Right<\/strong>: A graphical illustration of the centers in one of the iterations for \ud835\udc5b= 2 X 10<sup>5<\/sup>. Green points are the centers of our algorithm and the red points are the centers of LSH-clustering.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>\nFriendlyCore also performs well on large datasets, even without clear separation into clusters. We used the <a href=\"http:\/\/archive.ics.uci.edu\/ml\/datasets\/Gas+sensor+array+under+dynamic+gas+mixtures\">Fonollosa and Huerta<\/a> gas sensors dataset that contains 8M rows, consisting of a 16-dimensional point defined by 16 sensors&#8217; measurements at a given point in time. We compared the clustering algorithms for varying <em>k<\/em>. FriendlyCore performs well except for <em>k<\/em>= 5 where it fails due to the instability of the non-private algorithm used by our method (there are two different solutions for <em>k<\/em>= 5 with similar cost that makes our approach fail since we do not get one set of tuples that are close to each other).\n<\/p>\n<table align=\"center\" cellpadding=\"0\" cellspacing=\"0\" class=\"tr-caption-container\" style=\"margin-left: auto; margin-right: auto;\">\n<tbody>\n<tr>\n<td style=\"text-align: center;\"><a href=\"https:\/\/blogger.googleusercontent.com\/img\/b\/R29vZ2xl\/AVvXsEhWeKtTMARlbqPPuK4i93vOOQsRTl9U8vYKYRZEJDjAjwFUKCUdgtklSN0yjo5-XCqp9KMtLlimCS0XaZDcNOJILT66C72ZR651AzOXPV3BfTloVkGtLX8s-u3pcX9Ix6xoyHled4i6tAJ7Rro5P_RBjjyDbFyCP88OWszHCPDjbOxUI-J04VSGUYN-0g\/s550\/image34.png\" style=\"margin-left: auto; margin-right: auto;\"><img decoding=\"async\" border=\"0\" data-original-height=\"500\" data-original-width=\"550\" src=\"https:\/\/blogger.googleusercontent.com\/img\/b\/R29vZ2xl\/AVvXsEhWeKtTMARlbqPPuK4i93vOOQsRTl9U8vYKYRZEJDjAjwFUKCUdgtklSN0yjo5-XCqp9KMtLlimCS0XaZDcNOJILT66C72ZR651AzOXPV3BfTloVkGtLX8s-u3pcX9Ix6xoyHled4i6tAJ7Rro5P_RBjjyDbFyCP88OWszHCPDjbOxUI-J04VSGUYN-0g\/s16000\/image34.png\"\/><\/a><\/td>\n<\/tr>\n<tr>\n<td class=\"tr-caption\" 
style=\"text-align: center;\"><em>k<\/em>-means results on gas sensors&#8217; measurements over time, varying <em>k<\/em>.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2>Conclusion<\/h2>\n<p>\n<em>FriendlyCore<\/em> is a general framework for filtering metric data before privately aggregating it. The filtered data is stable and makes the aggregation less sensitive, enabling us to increase its accuracy with DP. Our algorithms outperform private algorithms tailored for averaging and clustering, and we believe this technique can be useful for additional aggregation tasks. Initial results show that it can effectively reduce utility loss when we deploy DP aggregations. To learn more, and see how we apply it for estimating the covariance matrix of a Gaussian distribution, see our <a href=\"https:\/\/proceedings.mlr.press\/v162\/tsfadia22a.html\">paper<\/a>.\n<\/p>\n<h2>Acknowledgements<\/h2>\n<p>\n<em>This work was led by Eliad Tsfadia in collaboration with Edith Cohen, Haim Kaplan, Yishay Mansour, Uri Stemmer, Avinatan Hassidim and Yossi Matias.<\/em><\/p>\n<\/div>\n<p><a href=\"http:\/\/ai.googleblog.com\/2023\/02\/friendlycore-novel-differentially.html\">Source link<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Posted by Haim Kaplan and Yishay Mansour, Research Scientists, Google Research Differential privacy (DP) machine learning 
algorithms<\/p>\n","protected":false},"author":2,"featured_media":349,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[20],"tags":[],"class_list":["post-348","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-google-ai"],"_links":{"self":[{"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/posts\/348","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/comments?post=348"}],"version-history":[{"count":1,"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/posts\/348\/revisions"}],"predecessor-version":[{"id":2898,"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/posts\/348\/revisions\/2898"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/media\/349"}],"wp:attachment":[{"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/media?parent=348"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/categories?post=348"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/tags?post=348"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}