
proposal: database/sql: add methods to scan an entire row into one value #61637

Open · jba opened this issue Jul 28, 2023 · 57 comments

@jba (Contributor) commented Jul 28, 2023

Edited: struct field names are matched to columns case-sensitively.
Edited: unmatched destination storage is left untouched rather than zeroed.


I propose adding the method

ScanRow(dest any) error

to the Row and Rows types. ScanRow attempts to populate dest with all the columns of the current row. dest can be a pointer to an array, slice, map or struct.

Motivation

ScanRow makes it more convenient to populate a struct from a row. Evidence that this is a desirable feature comes from the github.com/jmoiron/sqlx package, which has 9,800 importers, about 1,000 forks and about 14,000 GitHub stars. (To be fair, sqlx provides several other enhancements to database/sql as well.) The ScanRow method brings database/sql closer to parity with encoding/json, whose ability to unmarshal into structs and other data types is widely used.

Another motivation comes from the likely addition of iterators to the language. Without something like ScanRow, an iterator over a DB query would have to return a Row or Rows, since the Scan method is variadic. An iterator like that still improves on using a Rows directly because it makes error-handling more explicit and always calls Close. But we could do better with ScanRow, because the iterator could deliver a single value holding the entire row:

type Product struct {
    Name     string
    Quantity int
}

func processProducts() {
    for p, err := range sql.Query[Product](ctx, db, "SELECT * FROM products") {
       if err != nil {...}
       // use p
    }
}

This proposal doesn't include that iterator; I show it merely for motivation.

Details

ScanRow acts as if each part of its argument were passed to Scan.

If the value is an array pointer, successive array elements are scanned from the corresponding columns. Excess columns are dropped. Excess array elements are left untouched.

If the value is a slice pointer, the slice is resized to the number of columns and slice elements are scanned from the corresponding columns.

If the value is a map pointer, the underlying key type must be string. For each column, a map entry is created whose key is the column name and whose value is scanned from the column. Unnamed columns are assigned unique keys of the form _%d for integers 0, 1, .... Other entries in the map are left untouched.

If the value is a struct pointer, its exported visible fields are matched by name to the column names and the field values are scanned from the corresponding columns. The name matching is done as follows:

  1. If the field has a struct tag with key "sql", its value is the column name.
  2. Otherwise, the column name is matched to the field name case-sensitively.

Unassigned columns are dropped. Unassigned fields are left untouched.
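
To make the map case concrete, here is a sketch of how a query with an unnamed column might scan (ScanRow is the proposed method, so none of this compiles today; the schema is invented for the example):

rows, err := db.Query("SELECT id, name, quantity * price FROM products")
if err != nil {...}
defer rows.Close()
for rows.Next() {
    m := map[string]any{}
    if err := rows.ScanRow(&m); err != nil {...}
    // m["id"], m["name"], and m["_0"] for the expression column,
    // assuming the driver reports no name for it.
}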

@jba jba added the Proposal label Jul 28, 2023
@gopherbot gopherbot added this to the Proposal milestone Jul 28, 2023
@jba (Contributor Author) commented Jul 28, 2023

Maybe we should be more strict about struct name matching to prevent misspelling errors. Unfortunately that interferes with the ability to use the same struct for different partial views of its table.

@doggedOwl commented Jul 28, 2023

Otherwise, the column name is matched to the field name case-insensitively.

Maybe a snake_case match would be better as a default.
When field names are composed of more words most db conventions use snake_case for column names.
FieldName->field_name

Also, most projects I have seen already use the 'db' key in struct tags instead of 'sql'. If this proposal used the same key, there would be less work to convert existing projects.

Edit: meant snake_case (camelCase columns would already match with a case-insensitive match)
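
(For illustration, the FieldName -> field_name conversion such a default implies is only a few lines; this helper is invented for the example:)

// toSnake converts an exported Go field name to a snake_case column name.
// Naive: "ID" becomes "i_d"; a real implementation would handle initialisms.
func toSnake(field string) string {
    var b strings.Builder
    for i, r := range field {
        if unicode.IsUpper(r) {
            if i > 0 {
                b.WriteByte('_')
            }
            r = unicode.ToLower(r)
        }
        b.WriteRune(r)
    }
    return b.String()
}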

@Sinea commented Jul 28, 2023

@doggedOwl db is kind of generic; sql is clearer. And unpacking columns to fields by a naming convention doesn't seem like the "Go" way of doing things.

@PaluMacil commented:

Case insensitivity might be problematic. If you have a database collation set to case-sensitive, I believe SQL Server and MySQL will consider column names with two casings to be different columns. This doesn't apply to all databases, or to all usages of the databases it affects. Also, you can already alias each field in your select statement to be specific, or use tags as an exact match. I don't think trying to be clever about case insensitivity would be a good idea across databases that behave differently from each other.

@PaluMacil commented:

@doggedOwl db is kind of generic; sql is clearer. And unpacking columns to fields by a naming convention doesn't seem like the "Go" way of doing things.

Another benefit of "sql" as the tag is matching the package it comes from. That seems both concise and intuitive.

@Merovius (Contributor) commented:

Personally, I'd far prefer that scanning into a struct assigned the columns to fields in order - or at least having the option to. I don't particularly like the mechanism of using struct tags to control unmarshaling anyways and generally prefer my exported types to be tag-free. This would also allow scanning unnamed columns into structs.

Even if the iterator is not part of the proposal, given that it's part of the rationale, I'd also mention that I find the semantics of returning an error as an iteration value a bit confusing. It is not enforced that iteration stops if an error is returned. Arguably, that's a good thing - it might be that a specific row can't be scanned into a value, but for some use cases it might still make sense to iterate over the rest of the query. On the other hand, the ambiguity will potentially have people write incorrect code if they assume one semantic or the other.

@benhoyt (Contributor) commented Jul 30, 2023

I'd far prefer that scanning into a struct assigned the columns to fields in order

@Merovius Isn't that rather error-prone? If you change the order of fields in the select, but forget in the struct (which might be defined a ways away), it'll break.

I don't particularly like the mechanism of using struct tags to control unmarshaling anyways and generally prefer my exported types to be tag-free. This would also allow scanning unnamed columns into structs.

Out of interest, how do you unmarshal JSON with this approach? Or do you unmarshal into a non-exported struct with tags, and then copy to an exported struct without tags?

In terms of the proposal itself, I really like it -- I've wanted this for many years (and have used sqlx too). Code like rows.Scan(&item.ID, &item.Name, &item.TimeCreated, ...) gets tedious really fast (and is error-prone for the same ordering reason as above).

@jba For structs, will embedded fields be supported, like they are in sqlx? What will ScanRow do in the face of ambiguous column names, for queries like SELECT 1 AS a, 2 AS a or SELECT a.id, a.name, b.id, b.name FROM foos AS a JOIN foos AS b ON a.parent = b.id?

@Merovius (Contributor) commented:

@benhoyt I don't know if it's error-prone. I don't necessarily "buy" theoretical arguments about error-proneness. But AIUI, sqlx works that way already, so maybe the maintainers or users of that package can provide some data.

As for how I unmarshal JSON without tags: I declare an UnmarshalJSON method that unmarshals into an unexported struct type and then either converts it to the tagless one or copies over the fields I'm interested in. That's a bit of added boilerplate, but mostly because the encoding/json package relies on struct tags to begin with - I'd prefer an API that allows you to pass a function doing what the struct tags currently do. How that might look is off-topic here, IMO.

@derekperkins commented Jul 31, 2023

If we're going to add ScanRow(dest any) error, I would extend the proposal to also add ScanRows(dest any) error. If we can scan one row, it's not much more effort to scan all of them, appending to the provided slice, which would reduce a lot of boilerplate. The corollary in sqlx is Select, where ScanRow maps to Get, and just like encoding/json can Unmarshal a single object or an array of objects, this follows the same pattern.
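
(Sketching the shape being suggested here - a hypothetical signature mirroring sqlx.Select, not part of the proposal:)

// Hypothetical: scan every remaining row, appending to the slice.
var products []Product
if err := rows.ScanRows(&products); err != nil {...}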

@jba (Contributor Author) commented Aug 1, 2023

@Merovius Unless they are doing fine-grained memory optimization, people think of structs as unordered sets of fields. As does the language: re-ordering fields is never a breaking change.

As for iterators, there doesn't seem to be any appetite for error handling over on #61405, so I think having error as the second return type is the best we're going to get.

@jba (Contributor Author) commented Aug 1, 2023

@benhoyt Embedded fields will be supported. The implementation will use reflect.VisibleFields.

Personally, I think ScanRow should fail if there are ambiguous column names, but that's open for discussion.

@Merovius (Contributor) commented Aug 1, 2023

@jba Not to derail this, but: Re-ordering fields is (sometimes) a breaking change. It affects convertibility. And FWIW, the language also considers order for unkeyed literals (which the Go 1 compatibility promise makes an exception for) and comparisons.

@jba (Contributor Author) commented Aug 1, 2023

@derekperkins I would implement ScanRows using more general tools:

func CollectErr[T any](it Iter2[T, error]) ([]T, error) {
    var ts []T
    for t, err := range it {
        if err != nil {
            return nil, err
        }
        ts = append(ts, t)
    }
    return ts, nil
}
// ...

products, err := CollectErr(sql.Query[Product](ctx, db, "SELECT ...")) 

@jba (Contributor Author) commented Aug 1, 2023

@Merovius good point about re-ordering fields. I will have to revisit apidiff....

@derekperkins commented:
@jba I can't tell from your response, are you opposed to adding a ScanRows option? I think the majority use case for querying is to iterate and scan all rows before continuing. If we're thinking about adding ScanRow as a convenience function, adding ScanRows or something like it so users don't even have to iterate manually seems like an easy win.

@jba (Contributor Author) commented Aug 3, 2023

@derekperkins I am opposed because I think it's a one-liner given CollectErr. We've been talking about slices.Collect, so I'm hopeful that CollectErr will end up in slices, but even if it doesn't it's an obvious candidate for everyone's toolbox.

@derekperkins commented:
While I was unaware of slices.Collect conversations, I maintain that ScanRows would still be a worthwhile addition. There are multiple cases, like strings.Cut or strings.ReplaceAll, that just proxy other functions in the same pkg, but were added because that was the primary usage.

@Merovius (Contributor) commented Aug 4, 2023

FWIW I don't think that if we have iterators, collecting all rows into a slice will be the primary use. I think it might be today, but mainly because slices are the most convenient way to return ordered collections.

@PaluMacil commented:
@Merovius I'm aware of your disagreement from our discussion about removing Keys and Values from the maps package, but I think it bears repeating, with the acknowledgment that you are "technically" correct: iterators being better in many cases doesn't make them preferred. There are countless places in languages with iterators where people choose a list-like structure when performance doesn't matter, simply because it's easier for a lot of people to think about. For instance, people need to worry about whether an iterator is repeatable. They need to worry about whether they are iterating in multiple places and should have dropped the results into a list/slice already, because they are now actually being less efficient. And since most code these days (outside libraries) is written in boring IT departments without a lot of load, I'm guessing performance isn't a big issue.

@derekperkins commented:
There are countless places in languages with iterators where people choose a list-like structure when performance doesn't matter

My thought exactly. For the vast majority of CRUD-style / db-to-API transformations, there's not going to be a db rows iterator => json encoder => http.ResponseWriter chain, even if that might be the most efficient. That spans multiple logical tiers, and makes it harder to integrate with logging / error-reporting toolchains. IMO, the general expectation is a reasonable number of bounded rows that will just get put into a slice, though obviously that's just my projection.

@earthboundkid (Contributor) commented:
My experience in Python was that iterators were slower than using lists, at least for small lists. I don’t know if that experience would follow over in Go. It’s worth benchmarking to get a rough sense for it. It might also have different variability, which can be more important than pure performance per se.

@willfaught (Contributor) commented:
Unassigned fields get their zero value.

Shouldn’t this be “are left alone”? If they already have a nonzero value, they should keep it, unless there are explicit values for them.

@jba (Contributor Author) commented Aug 7, 2023

@willfaught That was deliberate. Leaving the existing state sounds error-prone to me. Especially since you may use the same struct for various queries that only select some of the fields.

What's a good use case for leaving the existing values alone?

@derekperkins commented:
Unassigned columns are dropped

Should this be configurable? Personally I would prefer this to error, because it probably means there was a typo in a field mapping. Your suggestion makes sense in the context of encoding/json silently discarding fields by default, though DisallowUnknownFields support was added in 1.10. While there is value in meeting some of those expectations, my experience is that JSON mappings are often for APIs outside your control, so you don't want to break if a field is added; a database call seems fundamentally different.

What's a good use case for leaving the existing values alone?

This might be another place for configuration, as both options have their place. Reasons I can think of to leave them alone:

  1. Performance penalty: row scanning is often a hot path, so reflecting across fields to set them to their zero value could have a meaningful cost. You could always drop down to the original rows.Scan pattern, but I would hope that this could be an abstraction with minimal overhead
  2. Filling extra fields: it's not uncommon in our code base to set an id or some base set of fields in a struct, then pass it into a sql function to "fill" the remainder. Since the id is already set, we usually don't query the id back out. Forcing field clearance would require either manually copying values to the original struct, or re-populating fields we already have from the db.

It's not clear from the json docs, but are all fields zeroed out during unmarshaling?
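
(The "fill" pattern from point 2, sketched against the proposed leave-untouched semantics; the ID field and query are invented for the example:)

p := Product{ID: 42} // ID already known
row := db.QueryRow("SELECT name, quantity FROM products WHERE id = ?", p.ID)
if err := row.ScanRow(&p); err != nil {...} // fills Name and Quantity; ID is left untouched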

@kardianos (Contributor) commented:
I like the high-level proposal of being able to pull an entire row at a time. I like to buffer my results first; I have a package that does something similar: https://pkg.go.dev/github.com/golang-sql/table . In the SQL interface I actually use day-to-day, I have a Get method (https://pkg.go.dev/github.com/kardianos/rdb#Result.Get), but again, I almost never work with an open result set; I always buffer the results first with https://pkg.go.dev/github.com/kardianos/rdb@v1.1.6/table .

For myself, I hate relying on any result ordering. I deal with queries that often return 10-30 columns, or more for wide tables. To be blunt, I think any API that relies on the ordering/index of columns is just waiting for hard-to-detect bugs to creep in.

The persistent problem in the current API is type conversion, especially in Scan. If we scan into structs or other name-matched destinations, we have name matching to handle as well. I suggest that this proposal be augmented with a way to inject, either at the database pool level (*DB) and/or at the *Rows level, additional type conversions and a name matching or value mapping method. Perhaps they could be the same API, so a custom function could be passed a list of column names, values, and a value pointer to map them to. If this was passed at the scan site, it could look like this:

    qOpt := sql.Q{
        SQL: "SELECT * FROM products",
        Map: func(columns []string, values []any, v any) error {
            // Map to value v.
            return nil
        },
    }
    for p, err := range sql.Query[Product](ctx, db, qOpt) {
       if err != nil {...}
    }

The map function could be standard and could have a default implementation. We could also expose the default conversion routines, or hooks into them. Note that this implementation is not stateful and thus probably not very efficient, because the column names would need to be looked up each time; a better mapper interface would be needed.
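
(One shape such a stateful mapper might take, purely as a sketch of the idea:)

type RowMapper interface {
    // Bind is called once per result set, so column-name lookup happens once.
    Bind(columns []string) error
    // Map is called per row with the scanned values and the destination.
    Map(values []any, dest any) error
}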

  1. I strongly dislike ordinal value mapping.
  2. Type conversion should be addressed in the same breath as this.
  3. I'm strongly in favor of having a reasonable default implementation for mapping types and names, but we need a way for an application that uses a different value or type to incrementally add them.

NOTE: Why is type conversion so important? Oftentimes a DB driver supports values of type X (date or decimal or a specific UUID type). Your application uses type Y for the same concept, but with a different implementation. Without integrated conversions you are stuck with two types working everywhere, rather than dealing with it once. To be honest, this is also a problem with query parameters.

NOTE2: civil Date, DateTime, Decimal, and UUIDs are some common types that can be frustrating to map, at least for myself.

@derekperkins commented:
The persistent problem in the current API is type conversion, especially in Scan

To clarify, this is mainly an issue because you can't implement Scanner/Valuer on types you don't own, right? We definitely have that issue, and end up with a lot of repeated manual handling.
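
(For reference, the usual workaround today is a local named type that implements sql.Scanner. A sketch for scanning a comma-separated TEXT column into a []string, a type we can't attach methods to directly:)

// strSlice lets a TEXT column holding comma-separated values scan into a []string.
type strSlice []string

func (s *strSlice) Scan(src any) error {
    switch v := src.(type) {
    case nil:
        *s = nil
    case string:
        *s = strings.Split(v, ",")
    case []byte:
        *s = strings.Split(string(v), ",")
    default:
        return fmt.Errorf("cannot scan %T into strSlice", src)
    }
    return nil
}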

@mvdan (Member) commented Sep 12, 2023

@jba are there plans to expose this API somewhere like x/exp before it ships in a Go release? I don't know database/sql well enough to tell whether that's possible or a good idea, but I'd like to test this out in some projects without needing to wait. Not just for this to be accepted and ship in a Go release, but also for another Go release to come and go, since most projects support two Go releases.

@jba (Contributor Author) commented Sep 12, 2023

Case insensitivity might be problematic... (#61637 (comment))

@PaluMacil, I changed the proposal to make field matching case-sensitive.

If anyone disagrees, please address the arguments in the above comment.

@jba (Contributor Author) commented Sep 12, 2023

@mvdan, we obviously can't add a ScanRow method, but it should be possible to write it as a function. I'll give that a shot. Stay tuned.

@eliben (Member) commented Sep 12, 2023

@PaluMacil I would like to understand the case-sensitivity concern better. Can you elaborate on the problematic scenario you envision here?

IIUC, you're worried that a database will have two columns with names that differ only in their case, and then this new method won't know how to do the scan?

The way the json package does this is "preferring an exact match but also accepting a case-insensitive match". Could we do the same here? In (hopefully rare) cases where tables do have columns with names that only differ by case, struct tags can fix things up. ISTM that the much more common case is unique column names, and since there's a natural discrepancy between exported field names in Go and how table columns are named, this can save a lot of tedious struct tags added just for the sake of amending the case.
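
(The "exact match preferred, case-insensitive fallback" rule is small to express as a field lookup - a sketch, using only the reflect and strings packages:)

func matchField(t reflect.Type, col string) (reflect.StructField, bool) {
    if f, ok := t.FieldByName(col); ok {
        return f, true // exact match wins
    }
    // FieldByNameFunc reports no match if several fields match at the
    // same depth, which also flags names that differ only by case.
    return t.FieldByNameFunc(func(name string) bool {
        return strings.EqualFold(name, col)
    })
}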

@PaluMacil commented:
@PaluMacil I would like to understand the case-sensitivity concern better. [...] The way the json package does this is "preferring an exact match but also accepting a case-insensitive match". Could we do the same here?

If that's consistent with JSON serialization, then I think that seems like a great approach. I would expect this to be an extremely uncommon issue.

@jba (Contributor Author) commented Sep 21, 2023

preferring an exact match but also accepting a case-insensitive match

I think the only reason for the preference would be if there were two fields or columns that differed only by case. That seems wildly unlikely for DB columns. I would prefer being case-insensitive by default, and requiring an exact match for names specified with a field tag.

@jba (Contributor Author) commented Sep 21, 2023

I've been thinking about adding options to this API, as @derekperkins suggested above.

In Go 1.10, the json.Decoder type got the DisallowUnknownFields method. That grew out of a desire for stricter JSON parsing. It was possible to make that change because adding a method to a concrete type is backwards compatible. There is no such opportunity for this proposal; to add configuration, we'd have to add a new method (ScanRowOpt?).

So let's anticipate that by changing the signature to

func ScanRow(dest any, opts *ScanRowOptions) error

At this point everyone says "I hate passing nil, let's use functional options," but:

  1. Using an options struct is consistent with the database/sql package (see TxOptions) and with the standard library as a whole.
  2. When the compiler complains because you forgot the nil, you have a chance to learn about how you can call this function in a way that might be better for you. (And if you're a reader or reviewer, that nil might pique your curiosity.) If the options are a variadic arg, you won't know about them unless you read the docs.
  3. Functional options have a lot of "API overhead": you need an option type and a function for each option. An options struct is just one line in the API index no matter how many fields it has.

The one option I'm thinking about would be the same as for encoding/json: whether additional columns should be allowed or cause an error. If we followed that precedent then the default would be to allow them, but I think it's better to start strict, and let the user relax the semantics when they realize they need to. (As pointed out in the comment linked above, the most likely bug is misspelling a field name.) Thus:

type ScanRowOptions struct {
    // If true, do not return an error if the result set has a column name that does not match
    // any field in the struct, or if it has more columns than slice elements.
    AllowUnusedColumns bool
}

We could also add an option for unmatched struct fields. But I'm less sure about that because we've seen that there are many valid uses for having additional fields, and if the field is unmatched because of a typo, ScanRow will fail by default because there will also be an unused column.
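
(Usage under this shape - hypothetical, since none of it exists yet:)

var p Product
for rows.Next() {
    if err := rows.ScanRow(&p, nil); err != nil {...} // nil options = strict default
}
// Relaxed, for queries that deliberately select extra columns:
err := rows.ScanRow(&p, &sql.ScanRowOptions{AllowUnusedColumns: true})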

@kardianos (Contributor) commented:
I just pushed a change to my table package that allows querying into struct slices and converting table buffers into struct slices: https://github.com/golang-sql/table/blob/master/struct.go

Still a few TODOs.

One thing to note: it may be better to opt into disallowing unknown struct fields globally, and make it a practice, if you want to ignore just a few fields, to use a tag such as sql:"-". Name mapping can be a large source of bugs.

@doggedOwl commented Sep 21, 2023

If the options are important, then maybe introducing a RowScanner and attaching options to it would be better than specifying options on each call of ScanRow. I can't think of any scenario where the options would change from one ScanRow to the next while looping through the rows, so that extra parameter creates too much noise for the value it provides.
Notice that in encoding/json, too, the options are placed on the Decoder, not on every Decode call.

That said, I'm not convinced of this particular need at all. Especially if the default is to disallow unknown fields, it will force people into more occasions for typing errors. Instead of "SELECT *" they will need to write a long list of exact field matches that is in fact more likely to contain errors.

@derekperkins commented:
it may be better to opt into disallowing unknown struct fields globally, and make it a practice, if you want to ignore just a few fields, to use a tag such as sql:"-"

I'd support setting this at the DB / Tx / Stmt level

@kardianos (Contributor) commented:
That said, I'm not convinced of this particular need at all. Especially if the default is to disallow unknown fields, it will force people into more occasions for typing errors. Instead of "SELECT *" they will need to write a long list of exact field matches that is in fact more likely to contain errors.

I don't think so. Think of the list of columns from the query and the struct as two distinct tables being joined. There may be unused columns from the query, and unused fields in the struct. We don't actually care about unused columns from the query: if you query select ID, Name, Birthdate as BAB and try to scan into a struct with the fields ID and DOB (note the typo in the query), then DOB will still throw an error. In fact, this should be the default. If you scan into a struct with extra fields, I would go so far as to say that any struct field you want ignored should have sql:"-" set in the field tag, and there should be no query-level option. It is obvious on inspection that the "Name" column isn't used; we don't need to report that. Thus, you can still run select * from Table if you want to; extra query columns are fine. Only extra struct fields are a problem.

@rowanseymour commented Sep 21, 2023

Requiring struct fields not in the returned column set to be annotated assumes that a struct is only fetched from the database in one way, and that's not how we have been doing it with sqlx. Besides scanning the RETURNING columns from an INSERTed struct into the same struct, we also sometimes want to refresh/fetch a limited set of fields on a struct from the database and leave other struct fields untouched. I definitely lean toward the simplicity of ignoring struct fields not included in the results, and that's consistent with json unmarshaling.

@derekperkins commented:
you can still run select * from Table if you want to; extra query columns are fine. Only extra struct fields are a problem.

I feel exactly the opposite, and anything we can do to disincentivize SELECT * is good. :) I don't mean that in an argumentative way, but as extra confirmation that we need to be able to configure behavior, as both behaviors are reasonable tradeoffs.

@doggedOwl commented:
Disincentivizing "SELECT *", if that is important to you, can be done in your linter. No need to add API complexity for that.

@derekperkins commented Sep 21, 2023

Sorry that my tongue-in-cheek remark didn't come across. This isn't about query patterns or whether you use SELECT *; we're trying to help people get the data they think they're getting. The two scenarios are:

  1. SELECT has more fields than matching destinations = discarded data
  2. SELECT has fewer fields than matching destinations = possibly missing data

There's no obvious "correct" behavior in the absence of an exact match of fields to destinations. Some people will want an error for one or both cases, and some people will want to silently succeed, like the default json configuration. Today, both error in stdlib usage because you have to manually map fields to destinations, and only exact matches are allowed.

The real question is what is the right api design to enable those choices to be made by users, and what the default should be. Probably the most expected default is to choose whatever happens in sqlx, if we expect users to switch, but that's up for debate.

@rowanseymour commented Sep 22, 2023

I like the sound of defaults being consistent with sqlx and not just because I'm lazy and want an easy migration path...

  1. Gives an error like missing destination name foo in *struct - I like this because there's no good argument for fetching more data from the database than you intend to use.
  2. Silently ignores - many use cases require fetching of partial data

@oakad commented Sep 22, 2023

I would say that what is needed is a relatively simple interface for iterating over column values for each row. In fact, Scan does exactly that: it iterates over column values. A similar function can be easily devised taking a callback-styled consumer as an argument, which will receive each column value in order.

If direct iteration over raw column values was supported, advanced use cases will be trivial to implement in the third party libraries.

pgx library has an additional method on the Row type, namely Values (https://github.com/jackc/pgx/blob/b301530a5fd50fe8ee46149fec58162a52bed8c6/rows.go#L286). This is somewhat better than Scan for ORM style wrappers, but again, directly exposing nextColumn would make even more sense and lead to better performance.

@earthboundkid (Contributor) commented:

In fact, Scan does exactly that: it iterates over column values.

I think if users have advanced needs, they should just use Scan.

@oakad commented Sep 22, 2023

Scan is very inconvenient both performance-wise and coding-wise; that's why we have this proposal and a whole bunch of ORM libraries doing a lot of heavy lifting around reflect.

On a modern cloud, databases and networks are very fast. But on the Go side, the driver and wrappers take their sweet time doing a dozen allocations simply to forward a "select" result set to the user. :-)

For a "server" language, Go SQL facilities are surprisingly limited.

@kardianos (Contributor) commented:

I would say that what is needed is a relatively simple interface for iterating over column values for each row. In fact, Scan does exactly that: it iterates over column values. A similar function can be easily devised taking a callback-styled consumer as an argument, which will receive each column value in order.

If direct iteration over raw column values was supported, advanced use cases will be trivial to implement in the third party libraries.

pgx library has an additional method on the Row type, namely Values (https://github.com/jackc/pgx/blob/b301530a5fd50fe8ee46149fec58162a52bed8c6/rows.go#L286). This is somewhat better than Scan for ORM style wrappers, but again, directly exposing nextColumn would make even more sense and lead to better performance.

This I fully agree with. In order to get values out of Scan today, I have to do this:
https://github.com/golang-sql/table/blob/master/tablebuffer.go#L170

I do agree that the SQL API still isn't great. This would be a separate proposal; it is only related in that it could make it more efficient to get data out of the scan. I used Scan at first, but it quickly became... unacceptable on a number of fronts. I don't like the Scan-style API. If such a Rows.Values proposal were made, my only question would be whether it would return:

  • []any
  • []NamedValue
  • Pass in a func([]any) error to control the lifespan, where any byte value would be copied out when the func exits.

Either way, a []any slice can be obtained from Scan (see link), so this would be an optimization.
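
(The workaround in miniature - today, raw values can be pulled through Scan by pointing every destination at an any:)

cols, err := rows.Columns()
if err != nil {...}
vals := make([]any, len(cols))
ptrs := make([]any, len(cols))
for i := range vals {
    ptrs[i] = &vals[i]
}
if err := rows.Scan(ptrs...); err != nil {...}
// vals now holds the driver values; per the Scan docs, []byte sources
// scanned into *any are copied, so the caller owns them.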


@benhoyt (Contributor) commented Dec 20, 2023

For what it's worth, here is a cut-down version of ScanRow implemented as a function which takes an *sql.Rows. It only handles dest as a struct, and it doesn't support embedded fields (probably not too hard to add). However, it's already proved useful for me, and saved a bunch of error-prone boilerplate, for example, on a query with 28 fields.
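
The core of such a function is small. A minimal sketch (not benhoyt's actual code), assuming dest is a pointer to a flat struct, matching names case-insensitively, with no tag or embedded-field support; it uses only reflect, strings, fmt, and errors:

func scanRowStruct(rows *sql.Rows, dest any) error {
    v := reflect.ValueOf(dest)
    if v.Kind() != reflect.Pointer || v.Elem().Kind() != reflect.Struct {
        return errors.New("dest must be a pointer to a struct")
    }
    elem := v.Elem()
    cols, err := rows.Columns()
    if err != nil {
        return err
    }
    ptrs := make([]any, len(cols))
    for i, col := range cols {
        f := elem.FieldByNameFunc(func(name string) bool {
            return strings.EqualFold(name, col)
        })
        if !f.IsValid() || !f.CanSet() {
            return fmt.Errorf("no settable field matching column %q", col)
        }
        ptrs[i] = f.Addr().Interface()
    }
    // Let the existing Scan machinery do the type conversions.
    return rows.Scan(ptrs...)
}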

@flibustenet commented:
@jba could we wake up this proposal for 1.23?

@jba (Contributor Author) commented Jan 24, 2024

Sure.

@benhoyt (Contributor) commented Mar 19, 2024

Could this proposal be added to the list of active proposals for review? It's well thought out and has lots of upvotes.

@flibustenet commented:
Link to "what would a Go 2 version of database/sql look like?" #22697

@achille-roussel (Contributor) commented:

@jba the iterator-based model you showed as an example looks really similar to what I put together in this package a few months ago: https://github.com/achille-roussel/sqlrange

In particular, the sqlrange.Scan function seems to combine what you are presenting with sql.ScanRow and iterators. The range function form addresses a lot of the performance concerns because the initialization cost can be amortized over all the rows being scanned. The added safety you mentioned of automatically closing the *sql.Rows is also a net benefit in my opinion.

I'm just sharing here because it seemed relevant to the discussion.

Thanks for driving this proposal!
