Yes, folks. The only people who don't study are those who have no brain and/or are lazy (and I wonder what someone with “and” instead of “or” would be like in this case…). Take good marketing research, for example, not the nonsense that appears in the society columns of the country's worst newspapers. It is sensational! While we sit here complaining about the very high productivity of the Chinese, there are people out there researching the market so they can earn a little extra and take that vacation trip. Take this short post about an application in R:
There is a strong positive relationship between demand for quality and willingness to pay, so a product manager might well decide that there was opportunity for at least a high-end and a low-end option. However, there are no natural breaks in the scatterplot. Thus, if this data cloud is a mixture of distinct distributions, then these distributions must be overlapping.
Another example might help. As shown by John Cook, the distribution of heights among adults is a mixture of two overlapping normal distributions, one for men and another for women. Yet, as you can observe from Cook’s plots, the mixture of men’s and women’s heights does not appear bimodal because the separation between the two distributions is not large enough. If you follow the links in Cook’s post, eventually you will find the paper “Is Human Height Bimodal?”, which clearly demonstrates that many mixtures of distributions appear to be homogeneous. We simply cannot tell that they are mixtures by looking just at the shape of the distribution for the combined data. The Old Faithful data with its well-separated bimodal curve provides a nice contrast, especially when we focus only on waiting time as a single dimension (see “Fitting Mixture Models with the R Package mixtools”).
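The point about overlapping mixtures can be checked numerically. The sketch below is in Python and is not taken from the quoted post. It assumes an equal-weight mixture of two normal distributions with a common standard deviation, and the function names and numeric values are hypothetical illustrations rather than fitted estimates for heights or for the Old Faithful data. For that special case, the mixture has two modes only when the means differ by more than two standard deviations. The code counts the local maxima of the density on a grid and compares the count with that rule.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, mu1, mu2, sigma, w=0.5):
    """Density of a two-component normal mixture with a shared sigma (an assumption)."""
    return w * normal_pdf(x, mu1, sigma) + (1.0 - w) * normal_pdf(x, mu2, sigma)


def count_modes(mu1, mu2, sigma, w=0.5, steps=2001):
    """Count strict local maxima of the mixture density on an evenly spaced grid."""
    lo = min(mu1, mu2) - 4.0 * sigma
    hi = max(mu1, mu2) + 4.0 * sigma
    xs = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    ys = [mixture_pdf(x, mu1, mu2, sigma, w) for x in xs]
    return sum(1 for i in range(1, steps - 1) if ys[i - 1] < ys[i] > ys[i + 1])


def is_bimodal_equal_mixture(mu1, mu2, sigma):
    """Rule for the equal-weight, equal-sigma case only: bimodal iff |mu1 - mu2| > 2 * sigma."""
    return abs(mu1 - mu2) > 2.0 * sigma


if __name__ == "__main__":
    # Hypothetical values: two overlapping groups, then two well-separated groups.
    for mu1, mu2, sigma in [(64.0, 69.0, 3.0), (55.0, 80.0, 6.0)]:
        print(mu1, mu2, sigma, count_modes(mu1, mu2, sigma),
              is_bimodal_equal_mixture(mu1, mu2, sigma))
```

In the first, overlapping case the combined density has a single peak even though two groups are present, which is why the shape of the pooled data alone cannot reveal the mixture. In the second, well-separated case two peaks appear.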
Market segmentation lies somewhere between mass marketing and individual customization. When mass marketing fails because customers have different preferences or needs and customization is too costly or difficult, the compromise is segmentation. We do not need “natural” grouping, but just enough coherence for customers to be satisfied by the same offering. Feet come in many shapes and sizes. The shoe manufacturer can get along with three sizes of sandals but not three sizes of dress shoes. It is not the foot that is changing, but the demands of the customer. Thus, even if segments are no more than convenient fictions, they can be useful from the manager’s perspective.
Finally, what is true for marketing segmentation is true for all of cluster analysis. “Clustering: Science or Art?” (a 2009 NIPS workshop) raises many of these same issues for cluster analysis in general. Videos of this workshop are available at Videolectures. Unlike supervised learning with its clear criterion for success and failure, clustering depends on users of the findings to tell us if the solution is good or bad, helpful or not. On the one hand, this seems to make everything more difficult. On the other hand, it frees us to be more open to alternative methods for describing heterogeneity as it is now and how it evolves over time.
We should not try to minimize the possible confounding effects of asking respondents to repeatedly make choices from sets of alternatives with varying features. This is the problem with within-subject design that Kahneman discusses in his Nobel Prize acceptance speech (pp. 473-474), “They are liable to induce the effect that they are intended to test.” Kahneman views preferences as constructions. (…) Orme sees real preferences that exist independently of the task. Moreover, these preferences are continuous. Choice data does not reveal all that is there because it does not reveal strength of preference.