Search for User Properties is unpredictable

I am trying to use Smart Groups as I try to clean up my database.

I’m trying to understand what it means for a User Properties filter to “begin with” or “end with” a string. I have two user properties defined for all my current listings, with keys “CBLOCATION” and “MEDIAMAIL”, and various values. But I added those fields at different times, and in some cases they were different keys which I later changed. When I create a Smart Group filtering for User Properties “begins with” “C” there are only 25 hits, though it’s clear those hits are weird, since there should be 3000 hits (because of the “CBLOCATION” key).

When I do the same thing for “begins with” “MEDIA” I get 1382 hits. On the one hand, that’s not all of them. On the other hand, the key “MEDIAMAIL” is not “first” in the User Properties.

Similar problems with “ends with”. There is no defined order of a key-value store, I assume, but this is really more of a WTF sort of result, since by random chance the values of these keys should be about half the listings.

On the other hand, “contains” does seem to work in the sense that it searches every key and value for a contiguous string, though occasionally I know there should be a different number of results.

One related practical question: How well does this system work (and is it tested yet) for strings containing (1) spaces, (2) other whitespace, (3) punctuation, (4) non-ASCII characters?

Also important: Is the behavior in Smart Group searches different in any way from the matchers in scripts?

Suggestions:

  1. get rid of the meaningless “begins with” and “ends with” search for User Properties
  2. create a new “with key equal to” and “with value equal to” searches which actually work
  3. not unrelated: Please add a clarification to the docs indicating that only strings with no non-alphanumeric characters can be used as keys and values, or modify the behavior to allow strings with underscore, hyphen, newline, etc. [aside: I am really hoping there is no eval involved here]

An example of what needs clarification:

  • I have a key “MEDIAMAIL” now, but earlier (before I understood that punctuation was bad) I was experimenting with “MEDIA?” for the key name of this boolean, so some listings have that
  • when I search for the explicit string “MEDIA?” (with question mark) it finds all occurrences of the substring “MEDIA” instead
  • now that I am trying to clean things up, I cannot find the broken records, since there is apparently no way to search for the string with the question mark; instead, I am manually paging through the entire database looking for where it pops up in the sidebar form

And a related use case:

I have at least two properties I would like to be boolean values. Because I have to search using “user properties contains”, which does not respect key-value boundaries, I cannot use values like “false” or “no”, and instead need to use unique “truthy/falsey” strings like “yesmedia”/“nomedia”. That way, as long as the other aspects of search/filtering work (see above), I can be sure the record contains what I am looking for.

Ideally, I would be searching (and more importantly, filtering in Smart Groups) for userProperties["CBMEDIA"] == false. As things stand now, I can only filter for “user properties contains "false"”, which is… pretty awful?

Because the ‘user properties’ smart group attribute is implemented on top of existing code originally meant for other parts of GarageSale, it’s probably not ideally suited to what you are trying to achieve.

Your best bet would probably be to use JavaScript to identify listings with certain user properties, like this:

function run() {
  // Collect every listing whose CBLOCATION value contains "BK02 33".
  var listingsToDisplay = [];
  for (const listing of allListings) {
    var cbLocation = listing.userProperties["CBLOCATION"];
    // Skip listings that don't have the key at all.
    if (cbLocation != null && cbLocation.includes("BK02 33")) {
      listingsToDisplay.push(listing);
    }
  }
  // Only show results if something matched.
  if (listingsToDisplay.length) {
    showListings(listingsToDisplay, "The following listings were found:");
  }
}

Alternatively, you could change the tag of a listing from within the script and have a Smart Group filter on that tag to show the script’s results.
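For the exact-match cases raised earlier (a key literally named “MEDIA?”, or a boolean-style value under one specific key), the same filtering idea can be sketched as a small helper. This is only an illustration: `findByProperty` and `sampleListings` are made-up names, the sample data is plain JavaScript objects rather than real GarageSale listings, and it assumes values are stored as strings. Inside GarageSale you would presumably pass `allListings` instead, as in the script above.

```javascript
// Hypothetical sample data standing in for GarageSale's allListings.
const sampleListings = [
  { title: "Book A", userProperties: { CBLOCATION: "BK02 33", CBMEDIA: "false" } },
  { title: "Book B", userProperties: { CBLOCATION: "BK02 34", CBMEDIA: "true" } },
  { title: "Book C", userProperties: { "MEDIA?": "yes" } },
  { title: "Book D", userProperties: null },
];

// Return listings that have `key` set. If `expectedValue` is given,
// the value stored under `key` must also match it exactly (as a string).
function findByProperty(listings, key, expectedValue) {
  return listings.filter((listing) => {
    const props = listing.userProperties;
    if (props == null || props[key] == null) return false;
    if (expectedValue === undefined) return true;
    return String(props[key]) === String(expectedValue);
  });
}

// Listings still carrying the old key with the question mark.
const legacy = findByProperty(sampleListings, "MEDIA?");
// Listings where one specific key holds the string "false".
const noMedia = findByProperty(sampleListings, "CBMEDIA", "false");

console.log(legacy.map((l) => l.title));
console.log(noMedia.map((l) => l.title));
```

Because the lookup goes through the key itself, punctuation in the key name and generic values such as “false” are not a problem here, unlike with the Smart Group “contains” filter.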

A little bit of background info:
Right now, smart group attributes are defined at compile time.

This means there is only a single attribute available for matching user properties.

In particular, the user properties are compiled into a single string that includes all keys and values, e.g. key1 value1 key2 value2 key3 value3....
This string is then tokenized with the kCFStringTokenizerUnitWord option (which probably treats whitespace and non-letter, non-digit characters as word boundaries), and each resulting token is then marked as included in the listing’s user properties.

Hence, searching for strings containing space characters doesn’t work.
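To get a feel for what that flattening and word tokenizing does to a set of properties, here is a rough stand-alone approximation. It is not GarageSale’s code: it uses JavaScript’s `Intl.Segmenter`, which follows Unicode word-boundary rules and may or may not agree with Apple’s tokenizer in every case, and `tokenizeUserProperties` is a made-up name.

```javascript
// Approximation only: GarageSale uses Apple's CFStringTokenizer,
// while this sketch uses Intl.Segmenter (Unicode word boundaries).
function tokenizeUserProperties(userProperties) {
  // Flatten to "key1 value1 key2 value2 ..." as described above.
  const flattened = Object.entries(userProperties)
    .map(([key, value]) => `${key} ${value}`)
    .join(" ");

  const segmenter = new Intl.Segmenter("en", { granularity: "word" });
  const tokens = new Set();
  for (const part of segmenter.segment(flattened)) {
    // Keep only word-like segments; spaces and punctuation are dropped.
    if (part.isWordLike) tokens.add(part.segment);
  }
  return tokens;
}

const locationTokens = tokenizeUserProperties({
  CBLOCATION: "BK02 33",
  "MEDIA?": "yes",
});
console.log([...locationTokens]);
```

In this approximation the value “BK02 33” ends up as two separate tokens and the question mark in “MEDIA?” is discarded, so neither the full value nor the punctuated key survives as a single searchable token. The key-to-value association is lost as well, since keys and values land in the same set.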

What’s eval?

This will do the job I have at the moment, thank you. Though it is a lot of work for a boolean search for a particular key’s value.

Yes, this was my assumption based on the behavior, especially given the random order of keys in most hash/map/dict structures. Remind me what language GS is written in? I might have a slight improvement, if I can translate it into your syntax.

I meant something like “I hope the map is not being converted into a string and then parsed and evaluated to create tokens”, but only the first two are true :)

One really important question for this is whether the period character is read as a word break or not (as with numbers, like currency). Similarly, the hyphen/minus sign.

I’ve just checked (and re-checked, because I made a mistake the first time!), and it looks like these are valid (findable) values with Smart Groups:

  • 39.12
  • 39,12
  • $39.12
  • -39.12

Again, these are useful to know, since currency is a 100% crucial thing to be able to store in these fields (so they can be parsed by scripts).

I understand it can be very difficult to refactor something like this, but I do have one idea which might help it be slightly more like the behavior of a regular map, in Smart Groups.

Could the map be converted (but not with the parsing) into strings representing each pair? You would have to select a syntax from the millions of them out there, but it could be something like

  • "(key => value)"
  • "{key value}"
  • "[key:value]"
  • etc.

This would preserve the association between the keys and values, and would also allow “contains” to do all the heavy lifting. As long as it was made explicit in the Smart Groups syntax what the rules were for searching these strings, it would be possible to do things like:

  • User Properties contains “(color => blue)” but not “(musical style => blues)”
  • User Properties contains “(variations => false)” but not “(discount => false)”
  • User Properties contains “fee => 3” [without the delimiters]
  • User properties contains “(consignment => true)” but not “(consignment fee => paid)”
  • and so on

Please give it a thought. It would be a relatively small change in the behavior of the program, and would not involve changing the “real” behavior in scripts. I think, as long as the delimiters are explicit punctuation, it would even work if the keys and values themselves contained them!

Also I think it would not change the current “contains” behavior? Unless I’m misunderstanding the parsing step and its effects. It would still be a single string, just with new syntax inserted for (1) delimiting key-value pairs and (2) dividing key from value.

But it would definitely help Smart Groups to be more useful for simple view-building.

Also, if you eliminated the parsing and simply did a concatenation (perhaps with cleanup for line breaks, just in case) it would still be robust?
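To make the idea concrete, here is a minimal sketch of the concatenation I have in mind, written in JavaScript only because that is what the scripts use. The function names and the “(key => value)” syntax are just my suggestion from above, not anything GarageSale does today.

```javascript
// Turn a key-value map into one string of explicit "(key => value)" pairs.
// Line breaks and runs of whitespace in keys and values are collapsed
// to single spaces.
function serializePairs(userProperties) {
  const clean = (text) => String(text).replace(/\s+/g, " ").trim();
  return Object.entries(userProperties)
    .map(([key, value]) => `(${clean(key)} => ${clean(value)})`)
    .join(" ");
}

// A plain substring test, standing in for the Smart Group "contains" filter.
function propertiesContain(userProperties, needle) {
  return serializePairs(userProperties).includes(needle);
}

const record = {
  color: "blue",
  "musical style": "blues",
  "consignment fee": 3,
  discount: false,
};

console.log(serializePairs(record));
console.log(propertiesContain(record, "(color => blue)"));
console.log(propertiesContain(record, "(variations => false)"));
console.log(propertiesContain(record, "fee => 3"));
```

With the delimiters included, “(color => blue)” matches the color pair but not the “blues” value, and “(variations => false)” does not match just because some other key is false. Searching without the delimiters (“fee => 3”) is looser: it would also match a value of 30, so the closing delimiter matters when an exact value is wanted.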

Are you sure that this is correct? I just put in test => 39.12 39,12 $39.12 -39.12 as a property, and Apple’s tokenizer stripped the currency and minus signs and returned only:

  [0] = "39,12"
  [1] = "39.12"
  [2] = "test"

This behavior may also depend on your language and currency settings.

Oh that is interesting. I hadn’t tried a large string with spaces in it. I was just trying the items one at a time.

Here, I can definitely put just each of the single strings in as the value, and (US settings with some other languages installed) GarageSale’s Smart Group search for “contains” will discover all four items as a match. I can also find “€39.12” btw.

I can also enter the value you tested, and the Smart Group will find each of the four “words” in the value, which is honestly all I really need to work at the moment.

I suppose one thing I could spend some time on is date strings, but everybody knows that way lies madness. (It does work, though, at least with ISO 8601 dates.)
